ABSTRACT

This  report  describes  a  group  of  experiments  whose 
purpose  was  the  expansion,  refinement,  and  validation 
of  the  CACTOS  computer-network  analysis  model.  The 
experiments  were  conducted  on  both  existing  and  planned 
computer  networks  in  order  to  arrive  at  conclusions 
with  respect  to  computation  resources  and  to  obtain 
guidelines  for  use  in  the  design,  construction,  and 
modification  of  computer  networks. 

The  primary  issue  investigated  was  the  question  of 
whether,  in  general,  centralized  or  decentralized 
(distributed)  computational  power  offers  the  best 
potential  performance  and  cost-effectiveness  for 
present  and  future  computer  network  configurations. 

The  conclusion  was  that  partial  decentralization, 
using large computers to achieve economies of scale,
provides optimum results for the types of computer
networks that will be constructed to meet the needs
of the Department of Defense during the 1975-1980
time period.

1. INTRODUCTION

This document describes the final results and analyses undertaken in the CACTOS
(Computation and Communication Trade-Off Studies) Project supported by the
Advanced Research Projects Agency (ARPA). The goal of CACTOS is to analyze, and
determine the relationships among, the parameters of computer networks and the
performance measures of such networks. The parameters include the hardware and
software computation parameters as well as those associated with the
communication network intertying the computer sites. The results of this study are
intended to be employed by Department of Defense (DoD) agencies in the planning,
design, improvement, and modification of military computer networks.


These objectives have been achieved and are described herein. One aim of the
Project was to analyze existing networks. The other was to determine the parameters
and performance measures of importance, as well as the future requirements for
information handling in DoD agencies, with regard to planned networks. To
perform the trade-off analyses in a quantitative manner, an analytic modeling
tool was constructed. Programmed in FORTRAN IV, the model is capable of handling
both computation and communication parameters. It is described in detail in
Appendix A, along with the underlying equations, information flow, and
assumptions.


In  order  to  be  of  utility,  a  network  analysis  tool  must  be  validated  for  both 
computation  and  communication  analyses.  The  validation  of  the  software  and 
hardware  computation  analysis  is  described  in  Appendix  B;  the  communication 
analysis  was  drawn  from  previous  work,  and  its  validation  was  discussed  in  an 
interim report. Validation was also performed using several existing systems
considered early in the study.


The  major  analysis  tasks  of  the  CACTOS  Project  are  described  in  Sections  2  and 
3.  The  two  sets  of  experiments  for  obtaining  the  general  relationships  are 
described in Section 2.1. In Section 2.2, the results of the first set of
experiments are reviewed. In particular, trade-offs were performed among
distributed and centralized processing, core and job size, and job
type as measured by the average percentage of time jobs were in CPU versus I/O
operations. It was found that a configuration of large computers distributed
in a semi-centralized configuration was more cost-beneficial than either widely
distributed processing with more, smaller machines or a centralized large
computer concentration; this preference held for almost all realistic parameter
values. Cost-effectiveness was viewed as the ratio of workload to the product
of monthly total cost and average total response time. The superior performance
of a CPU-oriented (e.g., scientific) network was also demonstrated. The most
effective core sizes tended to be small or medium.

The  second  set  of  experiments  (the  results  of  which  are  described  in  Section  2.3) 
was  oriented  toward  obtaining  guidelines  in  the  construction,  design,  and 
modification  of  networks.  In  these  experiments,  communication  lines  were  removed 
from  a  completely  connected  configuration  in  a  stepwise  fashion  based  on  the 
criteria  of  least  loaded  or  least  cost-effective  lines.  It  was  shown  that  these 
procedures  can  lead  to  more  cost-beneficial  configurations  than  some  typical 
configurations,  such  as  rings  and  stars. 

The  analysis  in  Section  3  confirms  some  of  the  analytic  conclusions  of  Section  2. 
This  section  presents  the  CACTOS  systems  analysis  work  for  one  present  and  two 
projected networks. These analyses have a broad scope. The first network
considered was the Marine Corps Personnel System (JUMPS/MMS), which is connected
through AUTODIN. The CACTOS analysis revealed how system performance could be
enhanced by redistributing some of the data bases and logic of the network. The
second network analyzed was the Air Force's proposed Advanced Logistics System.
Here, optimal channel capacities were computed along with measurements of
computation. At a higher level than the analysis of existing network plans is
requirements analysis for new networks whose plans have not yet been developed.

To explore these requirements analyses, a General Services Administration
request for proposals for a computer network was employed. The analysis
revealed that less than 10% of the fiscal resources should be spent on
communication. Furthermore, even with a high percentage of job time in
input/output operations, it is most cost-beneficial to spend as much on the
central processing unit (as opposed to core) as possible.

The optimal configuration was semi-centralized, with two main computer centers
spread across the United States. These results are in close agreement with
those in Section 2.2. Both sets of experiments revealed the same fiscal
percentages in communication and CPU for the optimal configuration.

Section  4  presents  some  recommendations  and  remarks  based  on  the  CACTOS  Project 
results.  These  recommendations  include  possible  new  directions,  such  as  network 
integration.  One  result  of  the  CACTOS  work,  namely  the  analytic  model,  is  now 
available for use by other DoD agencies for considering specific conditions and
networks.

2. EXPERIMENTS


2.1  FRAMEWORK  OF  EXPERIMENTS 

Two  sets  of  experiments  were  conducted.  Experiment  Set  I  was  designed  to 
examine  computation  characteristics.  Experiment  Set  II  was  aimed  at  network 
design  for  optimal  performance.  The  framework  for  each  of  these  is  described 
in  2.1.1  and  2.1.2  respectively. 

2.1.1 Experiment Set I

The framework for the experiments whose results are described in Section 2.2
was a 40-node network with centers located in the cities listed in Table 2-1
(the numbers in parentheses indicate the number of centers in the given city).


TABLE 2-1

CITIES FOR EXPERIMENT SET I

Index of City   City                      Index of City   City

      1         Seattle                        14         Denver
      2         Buffalo                        15         Cincinnati
      3         Boston (2)                     16         San Francisco
      4         Portland                       17         Kansas City
      5         Milwaukee                      18         St. Louis
      6         Minneapolis                    19         Los Angeles (3)
      7         Detroit (2)                    20         Phoenix
      8         New York (6)                   21         Atlanta
      9         Chicago (3)                    22         San Diego
     10         Pittsburgh                     23         Dallas
     11         Philadelphia (2)               24         New Orleans
     12         Cleveland                      25         Houston
     13         Washington, D. C. (3)          26         Miami

These cities were selected in part on the basis of geographic distribution and
in part on the basis of population and density statistics. Each center was given
one or more specified computers tied into the network by a standard modem device.
The network was assumed to be of the message-switching type. The assumptions used
to perform the experiments were as follows:

1. Remote and local jobs are both considered. A remote job consists of
a message transmission, computation, and a return message. A local
job is entirely computation. Messages have single sources and
destinations.

2. All messages between two nodes follow the path with the minimum
number of links between the nodes (fixed minimum path routing). Ties
for the minimum length path are resolved by assignment to the least
loaded path.

3. Message and job arrival distributions are assumed to be negative
exponential.

4. Interarrival times are independent of message lengths and job sizes.

5. Nodes behave independently of each other. This implies
infinite-capacity message buffers.

6. Node switching delays are fixed.

7. Nodes have an infinite traffic capacity.

8. Multiprogramming and multiprocessing are not specifically accounted
for in the model.

9. Message transmission is assumed to be error-free. Retransmission is
not explicitly taken into account.
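
The routing rule of assumption 2 can be illustrated with a short sketch. The Python below is not the CACTOS FORTRAN IV code. It is a minimal illustration that enumerates all minimum-hop paths between two nodes by breadth-first search and resolves ties by choosing the path whose busiest link carries the least load. The function names, the adjacency-dictionary representation, and the reading of "least loaded path" as "smallest maximum link load" are assumptions made for the example.

```python
from collections import deque

def min_hop_paths(adj, src, dst):
    """Return every path from src to dst that uses the minimum number of links."""
    # Breadth-first search records hop counts and all shortest-path predecessors.
    dist = {src: 0}
    preds = {src: []}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                preds[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)
    if dst not in dist:
        return []
    # Walk the predecessor lists backwards to expand all shortest paths.
    paths = [[dst]]
    while paths[0][0] != src:
        paths = [[p] + path for path in paths for p in preds[path[0]]]
    return paths

def link(u, v):
    """Undirected link key (hypothetical helper)."""
    return (u, v) if u <= v else (v, u)

def route(adj, load, src, dst):
    """Pick the minimum-hop path whose busiest link is least loaded."""
    def busiest(path):
        hops = zip(path, path[1:])
        return max((load.get(link(a, b), 0.0) for a, b in hops), default=0.0)
    return min(min_hop_paths(adj, src, dst), key=busiest)
```

For a four-node ring A-B-C-D with existing load on link A-B, a message from A to C would be routed through D, since both candidate paths are two links long.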
Some comments on these assumptions are in order. The fixed minimum path routing
has been shown to be close to optimal and requires much less computation than
is required for calculations of the optimal case (Frank [3]). Arrivals are
assumed to be exponentially distributed, although
arrival times are more closely approximated by gamma distributions. However,
in many cases (especially for summed gamma distributions), the effective
differences between the gamma and exponential distributions have been
demonstrated to be small. Kleinrock [4] has shown that this occurs when all
users on a large network are considered simultaneously. The assumption of
infinite-capacity message buffers has been shown to be valid when the network
is operating at less than 80% capacity. In a network in an unsaturated state
with minimum time delay, the limitations on node capacity are minor (Kleinrock
[4]).


The model is described in detail in Appendix A. An overview of the model is
given in Figure 2-1. The inputs and outputs to the model are discussed in this
section.


The capacity of each communication line was set at 50kb. This channel capacity
bears a desirable ratio of communication to computation capacity and reflects
the optimum ratio of communication to computation costs found in past
experiments of other researchers for second-generation computing equipment. To be
representative of third-generation capabilities, the configuration of the
network was based on the assumption that the articulation of the network was
two; that is, at least two links must be broken to break communication between
two centers. The monthly communication cost was based on standard available
rates for the 50kb line size, given by $15.00 for each of the first 250 miles,
$10.50 for each of the next 250 miles, and $7.50 for each mile beyond 500
miles. A minimum cost of $250 per month per line was also assumed.
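
The line tariff just described can be written as a small function. This is a sketch of the stated rate schedule only, assuming the quoted rates are per mile per month; the function name is hypothetical.

```python
def monthly_line_cost(miles):
    """Monthly cost in dollars of one 50kb line of the given length in miles."""
    first = min(miles, 250.0)                      # first 250 miles
    second = min(max(miles - 250.0, 0.0), 250.0)   # next 250 miles
    beyond = max(miles - 500.0, 0.0)               # every mile beyond 500
    cost = 15.00 * first + 10.50 * second + 7.50 * beyond
    return max(cost, 250.0)  # minimum charge per line per month
```

Under this schedule the minimum charge applies only to very short lines, such as those joining centers within the same city.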


The general network topology appears as in Figure 2-2. Distances were computed
by the model using latitude and longitude data (computer centers in the same
city were assumed to be three miles apart). The network's topology was
constructed iteratively: at each step, the most highly loaded group of
links was found, a link was added to reduce this load, and a lightly loaded
link elsewhere in the network was deleted. This proceeded in an enumerative way
until no further improvement was possible.
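
The report does not state which formula the model used to turn latitude and longitude into distance. As one plausible illustration, the sketch below uses the standard haversine great-circle formula together with the three-mile rule for centers in the same city. The Earth radius constant, the tuple layout, and the function names are assumptions made for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles; inputs are in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def center_distance(a, b):
    """Distance between two centers given as (city, lat, lon) tuples.

    Centers in the same city are taken to be three miles apart, as in the text.
    """
    if a[0] == b[0]:
        return 3.0
    return great_circle_miles(a[1], a[2], b[1], b[2])
```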

To obtain a representative set of costs and performance, a single hardware
manufacturer and generation was selected. This was the IBM 360 line of
machines. This line was chosen because of the broad range of compatibly
operating equipment and a consistency of costs not yet present in the latest
product lines. The machines are listed in Table 2-2.

Since experiments were aimed in part at the type of the job, three basic types
of job mixtures were assigned. These, along with their parameters, are given in
Table 2-5 in Section 2.1.2. The main parameter here is the percentage of time the
average job spends in CPU. This ranges from 90% for scientific to 10% for
commercial. To show the sensitivity of computer throughput to core memory size, and to
investigate the relationship between job type characteristics and memory,
three levels of immediate memory were investigated for each computer. Details
on monthly rental price and performance were obtained from Keydata [1] and
Auerbach [2].

The cost and configuration information on the computers and peripherals used
(360/20, 360/85, and 360/195) appears in Tables 2-2 and 2-3. The
360/85 was assumed to have 2314 disc units, while the 360/195 was given 3330
units. The cost information reflects the costs of peripherals and the mainframe
computing unit on a monthly rate.


TABLE 2-2

EXPERIMENT SET I - COMPUTER COSTS

Model        Memory Size               Monthly Cost (000)
             (thousands of bytes)

360/85            500                       $ 85
                 1000                        150
                 2000                        200

360/195          1000                        215
                 2000                        258
                 4000                        300

360/20             16                          6


TABLE 2-3

TABLE OF COMPUTER CHARACTERISTICS*

                                              360/85             360/195

Job processing rate (instruc./microsec.)      6.25               21
Memory size (thousands of bytes)              500, 1000, 2000    1000, 2000, 4000
Word size (bits)                              32                 32
Disk transfer rate (bytes/microsec.)          312                806
Average disk access time (millisec.)          87.5               38.5
Disk cylinder size (bytes)                    146,000            247,600
Average I/O record size (bytes)               7224               7224

* The 360/20 is included only in communication costs, and its computational
characteristics are not included in the experiment.

The preceding information provides the general topology and cost framework.
Remaining to be specified are the job and message characteristics, as well as
the specific combinations of hardware for the experiments.
Three different computation configurations were assumed for the experiments:
distributed, semi-centralized, and centralized computing power. The detailed
assignments of computers are given in Table 2-4. (The number in parentheses
refers to the location within the city.) In all cases, the 360/20 computers
act as concentrators and message processors. The 360/195 computers are split
pairwise in a dual processing mode. In each case, the total raw throughput
capacity of the configurations was roughly equivalent (approximately 8 billion
modified bits per second). For the 360/195, this takes into account a 15%
loss, resulting from executive software overhead in coordinating the dual
processors.


TABLE 2-4

EXPERIMENT SET I - COMPUTATION CONFIGURATION

Configuration        Computer Assignment

Distributed          360/85 at all sites

Semi-centralized     360/195 - 2 at each of
                         Boston (2)
                         New York (5)
                         Chicago (3)
                         Washington, D. C. (3)
                         St. Louis
                         Los Angeles (1)
                     360/20 - other centers

Centralized          360/195 - 2 at each of the 6 New York centers
                     360/20 - other centers
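
The statement that the configurations have roughly equivalent raw capacity can be checked with simple arithmetic from Tables 2-3 and 2-4. The sketch below assumes that "modified bits per second" means the instruction rate multiplied by the 32-bit word size, and that the 15% dual-processor loss is applied directly to the 360/195 instruction rate; both readings are assumptions, as are the names used.

```python
WORD_BITS = 32           # word size, Table 2-3
RATE_360_85 = 6.25       # instructions per microsecond, Table 2-3
RATE_360_195 = 21.0      # instructions per microsecond, Table 2-3
DUAL_OVERHEAD = 0.15     # executive-software loss for paired 360/195s

def raw_capacity_bits_per_sec(n_machines, rate, overhead=0.0):
    """Raw capacity in bits/second, assuming one 32-bit word per instruction."""
    return n_machines * rate * 1e6 * (1.0 - overhead) * WORD_BITS

# Distributed: a 360/85 at each of the 40 centers.
distributed = raw_capacity_bits_per_sec(40, RATE_360_85)

# Semi-centralized and centralized: 2 x 360/195 at each of 6 centers.
paired_195 = raw_capacity_bits_per_sec(12, RATE_360_195, DUAL_OVERHEAD)

print(f"distributed: {distributed:.3e} bits/sec")
print(f"12 x 360/195 with overhead: {paired_195:.3e} bits/sec")
```

Under these assumptions the distributed case comes to 8 billion bits per second and the two 360/195 cases to somewhat less, which is in line with the "roughly equivalent" wording of the text.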


The  mixture  of  job  types  is  similar  to  that  experienced  with  a  general-purpose 
computer  utility  (on-line,  interactive  operations  meshed  with  remote-job- 
entry,  non-interactive  batch  processing).  These  job  types  were  deliberately 
chosen  to  emphasize  extremes  of  job  mixes  and  to  provide  information  about 
the  relative  merits  of  centralized  and  distributed  processing  power  for 
various  job  types.  Further,  the  job  configurations  are  directly  related  to 
the  throughput  efficiency  of  the  different  computer  configurations  also  being 
evaluated  in  these  experiments. 
The experiments assumed that jobs consist of messages that in turn are
decomposed into packets. A packet is the basic unit of bits in the
communications part of the experiments. The packet size was set at 2,000 bits.

Two other parameters that had to be determined were job size and the
job-arrival matrix. The job-arrival matrix has as its (i,j)th entry the number of
jobs sent from i to j. The job size was set at one
of three values: 5, 10, and 20 megabits. It was recognized that some
combinations would be unrealistic. The job-arrival matrix was first set for
the distributed case and then centralized as the computer configuration was
centralized. To derive the number of jobs arising from a given city, the
proportion of city population to the total population in all cities was
multiplied by the total permissible jobs. The creation of a traffic matrix
was based upon the relative distance of the computing centers from the source
cities. In the completely centralized case, the job load was distributed
equally among the centralized computers. Jobs arising locally around a
centralized computer were all assigned to that computer. In the
semi-distributed case, where the computers were dispersed to locations about the
nation, the traffic was distributed to the processing centers as the square
root of the distance from the source city, normalized to the sum of the
square root distances to obtain a proportion of traffic. For the completely
distributed case, 50% of the jobs arising at a source city were assigned to
the computer at that city. The remaining jobs were distributed among all
other cities by the square-root distance formula used above. The selection
of square root of distance was based on reducing loads between distant cities
somewhat but not to an excessive degree. Another assumption of the message
traffic was the allowance for acknowledgment messages.
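
The square-root distance allocation can be sketched as follows. The text does not give the formula explicitly. Because the stated aim is to reduce loads between distant cities, the sketch assumes that each center's weight is inversely proportional to the square root of its distance from the source city; that reading, the function names, and the dictionary layout are assumptions rather than the documented CACTOS procedure.

```python
import math

def sqrt_distance_shares(distances):
    """Split traffic among centers with weights 1/sqrt(distance), summing to 1.

    `distances` maps center name -> miles from the source city (all > 0).
    The inverse weighting is an assumption made for this sketch.
    """
    weights = {c: 1.0 / math.sqrt(d) for c, d in distances.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def distributed_case_jobs(total_jobs, distances_to_others, local_fraction=0.5):
    """Jobs kept at the source city and jobs sent to each other city."""
    local = total_jobs * local_fraction
    shares = sqrt_distance_shares(distances_to_others)
    remote = {c: (total_jobs - local) * s for c, s in shares.items()}
    return local, remote
```

With this weighting, a center four times as far away receives half as much traffic, which reduces long-haul load without eliminating it.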

The above framework established a set of 81 distinct experiments in which
three values of each of the following parameters were set: job size,
configuration, job type, and core size. Total cost of the network
configuration varied from 3.1 to 8.3 million dollars per month.
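
The count of 81 follows from three values for each of four parameters (3 x 3 x 3 x 3). A short enumeration makes the design explicit. The core-size labels are placeholders, and the ordering produced here is not claimed to match the experiment numbering used in the report.

```python
from itertools import product

JOB_SIZES_MEGABITS = (5, 10, 20)
CONFIGURATIONS = ("distributed", "semi-centralized", "centralized")
JOB_TYPES = ("scientific", "mixed", "commercial")
CORE_LEVELS = ("small", "medium", "large")  # placeholder labels

# Full factorial design: every combination of the four parameters.
experiments = list(product(JOB_SIZES_MEGABITS, CONFIGURATIONS,
                           JOB_TYPES, CORE_LEVELS))
print(len(experiments))
```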
It should be noted that many additional runs were performed to set up the
network configuration and its parameters. In particular, experimental runs
were required in configuring the network topology itself and in setting
parameters to obtain feasible response times. A response time is treated as
infinite, and the configuration as infeasible, when it is impossible to
process the jobs in a day.

The absolute level of work was based initially upon an estimated 70%
utilization rate of the raw processing power (i.e., total megabits modified per
second by all computers) of the system. This total utilization level was
adjusted during the experimental runs to reflect the reduced throughput
resulting from the assumptions concerning job characteristics.

The  assumption  that  jobs  arise  in  proportion  to  population  has  been  made  in 
previous network analysis reports in connection with message traffic.
Dependence on the populations of cities at both ends of the link can reflect the
difference  in  computing  power  for  major  centers.  The  results  of  the  first 
set  of  experiments  are  examined  in  Section  2.2. 

2.1.2  Experiment  Set  II 

The second set of experiments was focused on the configuration and
communications aspects of networking. In the first set of experiments, the network
topology was fixed, and computation and message properties varied. Here, the
goal was to configure a network based on articulation, reliability, and
cost-effectiveness. By following several policies of link deletion from a
completely connected configuration, the most cost-effective topology was
derived under various constraints of articulation level. The
cost-effectiveness measures of the resulting configurations are compared in
Section 2.3 with those of ring and star topologies.
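
One of the link-deletion policies, removal of the least loaded line, can be sketched as a greedy procedure. The Python below is an illustration only, not the CACTOS procedure. It starts from a completely connected network, routes each pair's traffic along one minimum-hop path, and repeatedly deletes the least loaded link whose removal still leaves a network that survives any single link failure (articulation level two). All function names and the `traffic(src, dst)` callback are assumptions, and the cost-effectiveness policy is not modeled.

```python
from collections import deque
from itertools import combinations

def connected(nodes, links):
    """True if every node can reach every other node over the given links."""
    adj = {n: set() for n in nodes}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u] - seen:
            seen.add(v)
            queue.append(v)
    return len(seen) == len(nodes)

def survives_single_failure(nodes, links):
    """Articulation level two: no single link failure disconnects the network."""
    return connected(nodes, links) and all(
        connected(nodes, links - {l}) for l in links)

def link_loads(nodes, links, traffic):
    """Load on each link when each pair's traffic follows one minimum-hop path."""
    adj = {n: [] for n in nodes}
    for u, v in sorted(links):
        adj[u].append(v)
        adj[v].append(u)
    load = {l: 0.0 for l in links}
    for src in nodes:
        # A breadth-first tree from src gives one minimum-hop path to each node.
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        for dst in nodes:
            node = dst
            while parent[node] is not None:
                a, b = parent[node], node
                load[(a, b) if (a, b) in load else (b, a)] += traffic(src, dst)
                node = a
    return load

def prune_least_loaded(nodes, traffic):
    """Greedily delete least-loaded links while the constraint still holds."""
    links = set(combinations(sorted(nodes), 2))  # completely connected start
    while True:
        load = link_loads(nodes, links, traffic)
        for l in sorted(links, key=lambda x: (load[x], x)):
            trial = links - {l}
            if survives_single_failure(nodes, trial):
                links = trial
                break
        else:
            return links  # no further link can be removed
```

For the eight-node case with a symmetric traffic matrix, a call such as `prune_least_loaded(list(range(8)), lambda s, d: 1.0)` returns a reduced link set in which no further link can be deleted without violating the articulation constraint.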

The framework of these experiments was more restrictive than that of the
first set. Eight nodes were selected in the following cities:

San Francisco        Boston
Los Angeles          New York
Chicago              Philadelphia
Detroit              Washington, D. C.


Initially,  a  fully  connected  network  topology  was  assumed,  so  that  any  two 
sites  could  communicate  directly.  Line  capacity  was  set,  as  before,  at  50kb. 
No computation was done at any node, so that the computation processing
characteristics were deleted.


The message size was set at 2,000 bits, and the packet size at 1,000 bits.
No acknowledgment messages were assumed. Two job-arrival matrices were formed
on two bases. The first was the distance-population formula of the first
set of experiments. The second was a symmetric traffic matrix where an equal
number of jobs was sent between any two nodes. The experimental results are
discussed in Section 2.3.


2.2  EXPERIMENTAL  RESULTS:  SET  I 

The basic experiments varied the concentration of computing power, job type,
core size, and job size. Inferences can be made concerning the relationship
between system input parameters and network performance measures. The basic
network performance measures are cost, response time, job throughput, and
measures of cost-effectiveness.


Costs included the entire monthly costs associated with the computer
and communication system hardware, including the computer peripherals
and memory units, communication interface units (modems, switches,
concentrators, etc.), initial costs of the central processing units, and
direct channel costs.

Response time was defined as the mean time for both computation and
communication processing, often involving the averaging of combinations
of times from several computation nodes and communication links. However,
initial generation and output distribution times were ignored;
only network times were considered.

Throughput was defined as the number of jobs processed per day, modified
by considerations of job and message size. Throughput is thus the
workload of the system, rather than system capacity.

The principal measure of cost-effectiveness was the quantity throughput
per  dollar  per  unit  response  time.  That  is,  the  total  throughput 
(number  of  jobs  times  job  size)  was  divided  by  the  product  of  total 
monthly  cost  and  the  mean  total  response  time. 
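
The measure defined in the preceding paragraph can be expressed directly in code. This is a minimal sketch of that definition; the function name, the parameter names, and the units in the usage note are hypothetical.

```python
def cost_effectiveness(jobs, job_size_megabits, monthly_cost, mean_response_time):
    """Throughput per dollar per unit response time.

    Throughput is the number of jobs times the job size; it is divided by
    the product of total monthly cost and mean total response time.
    """
    if monthly_cost <= 0 or mean_response_time <= 0:
        raise ValueError("cost and response time must be positive")
    throughput = jobs * job_size_megabits
    return throughput / (monthly_cost * mean_response_time)
```

As an illustration with made-up numbers, 1,000 jobs of 10 megabits at a monthly cost of 5 million dollars and a mean response time of 2 time units give an index of 10,000 / 10,000,000 = 0.001. Doubling the response time at the same cost and workload halves the index.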

In interpreting the results, the fact that several network configuration and
job characteristic parameters were held constant needs to be kept in mind.
Total computation and communication capacity was held roughly constant for all
configurations (the capacity for computation was set at 250 million
instructions per second, while the line capacities were fixed at 50kb for
communication). The network topology, except for the distribution of computing
capacity, was fixed (as described in the previous section). Although central
memory capacity was varied, all other aspects of the computer facility
configuration were held constant for a given computer. Hence, cost varied with
the constellation of computing processors and memories used, but not with
other (fixed) aspects of the computer or communication configuration.

The  results  of  the  81  experimental  runs  are  given  in  Table  2-5  in  a  nested 
arrangement.  The  code  for  concentration,  given  in  column  (1),  is: 

D — Distributed 
S — Semi-centralized 
C — Centralized 




1  September  1972 


TABLE 2-5

RESULTS OF EXPERIMENTS

[Columns (1) CONCENTRATION, (2) JOB TYPE, (3) CORE, (4) EXPT. NO.,
(5) 5 MB C.E., and (6) EXPT. NO. are not recoverable from this copy.
Surviving fragments: 23, 24, 02439. Entries shown as ----- are missing.]

      (7)              (8)              (9)
     10 MB                     20 MB
  C.E. x 10^-2      EXPT. NO.      C.E. x 10^-2

     20034             55             18826
     13743             56             13367
     11506             57             11400
     08730             58             04070
     06242             59             05581
     05259             60             05297
     04719             61                 0
     03032             62             03200
     02382             63             02929
     41341             64             43268
     38493             65             41003
     35030             66             37785
     15259             67             15740
     13754             68             15799
     12293             69             14561
     05247             70             05188
     04540             71             04751
     03983             72             04191
     35514             73             44019
     36931             74             42027
     33491             75             38843
     -----             76             15553
     -----             77             16447
     11498             78             15301
     03978             79                 0
     03429             80             06173
     03003             81             -----
The amount of core memory associated with each main processor is given in
megabytes in column (3). Columns (4), (6), and (8) give the experiment number,
and columns (5), (7), and (9) the cost-effectiveness (CE) index for each of
the 81 experimental runs. These sets correspond to the three job sizes: set
one for jobs of 5 megabits (MB), set two for jobs of 10 MB, and set three for
jobs of 20 MB. (Note that at the 20-MB job size, two experimental runs, numbers
61 and 79, exceeded the capacity of the system, so that mean response time was
very long, and a zero cost-effectiveness index is indicated.) Cost-effectiveness is
defined here as the square root of the throughput divided by the product of (a)
total network costs squared and (b) the mean total response time.

Although some caution must be exercised in interpreting the results since only
a  few  of  the  myriad  possible  variables  and  combinations  of  variables  were 
manipulated,  some  conclusions  seem  clear.  For  instance,  the  productivity  of 
a  particular  configuration  is  related  in  a  direct  and  dramatic  fashion  to  the 
degree  of  CPU  utilization  that  is  achieved.  The  cost-effectiveness  index 
falls  drastically  as  the  fraction  of  time  in  computation  changes  from  90% 
(scientific) to 50% (mixed) and 10% (commercial). While the inefficiency of
I/O-bound  jobs  is  generally  accepted  in  the  computation  and  communication 
field,  that  the  interrelationship  should  be  so  severe  was  not  entirely  expected. 
Although  interleaving  and  time-sharing  of  jobs  may  do  much  to  alleviate  the 
inefficiency,  it  would  appear  that  data-handling  techniques  may  have  greatest 
promise  for  technological  payoff  in  the  future. 

A  similar  instance  is  the  growth  of  productivity  with  computing  load  (job 
size,  in  this  case).  The  cost-effectiveness  index  continues  to  increase 
until  the  system  or  processor  becomes  saturated,  after  which  it  begins  to 
fall off rapidly. In other contexts, the efficiency of channel utilization
in terms of response time has been found to decline, depending upon a variety
of circumstances, in the 70%-90% channel-utilization range. While there are
not enough data points in this study to make such fine distinctions, the
general premise is supported. Since the relationship between load and response
efficiency is well established, the experiments were designed not to reaffirm
it  but  to  explore  some  of  the  interrelationships  between  job  size  and  memory 
size  and  the  resulting  cost-effectiveness. 
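
The well-established relationship between load and response time can be illustrated with the textbook single-server queue with exponential arrivals and service (M/M/1), which is consistent with the exponential assumptions of Section 2.1.1. This is a standard result used for illustration only; the equations actually used by the model are those of Appendix A, and the function name is hypothetical.

```python
def mm1_response_time(service_time, utilization):
    """Mean time in system for an M/M/1 queue: T = S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        return float("inf")  # saturated: the queue grows without bound
    return service_time / (1.0 - utilization)

# Response time, in multiples of the service time, at several load levels.
for rho in (0.5, 0.7, 0.8, 0.9, 0.95):
    t = mm1_response_time(1.0, rho)
    print(f"utilization {rho:.2f}: {t:.1f} service times")
```

The response time grows slowly at moderate load and very steeply as utilization approaches 1, which is the behavior described above for the 70%-90% range.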

The  relationship  between  job  size  (load)  and  memory  size  seems  reasonably 
clear. The efficiency of core-storage additions increases as the load on the
processor  increases.  That  is,  from  our  data,  at  a  relatively  light  load  (jobs 
of  five  MB)  small  core  is  more  cost-effective;  at  relatively  high  loads  (jobs 
of  20  MB),  larger  core  becomes  more  cost-effective.  The  shift  is  a  gradual  one, 
however;  even  with  loads  large  enough  to  swamp  computers  with  relatively  small 
core  sizes,  a  moderate  core  is  more  cost-effective  than  a  very  large  one.  This 
is  revealed  in  Figure  2-3,  which  graphs  job  size  versus  cost-effectiveness  for 
various core sizes. Further investigation of this relationship, of memory mix,
of  parallelism,  and  of  pipelining  seems  desirable.  However,  these  would  probe 
somewhat  more  deeply  into  processing  configurations  than  is  desirable  for  general 
networking  applications.  It  might  be  pointed  out  that  system  response  time 
continues  to  improve  with  increasing  core  sizes.  It  is  the  disproportionate 
cost of extra core (in comparison to the increase in cost-effectiveness) that
inhibits  the  strength  of  the  relationship. 

Of major interest in this investigation is the relationship between
distributed and centralized computing. The difference in relative concentration of
computing  power  between  the  completely  centralized  and  the  semi-distributed 
cases  lies  in  the  distribution  of  large  computers  by  location  around  the  U.  S. 
That  is,  the  centralized  case  does  not  use  one  super-powerful  computer,  but 
a  concentration  of  12  very  large  ones,  a  situation  that  is  duplicated  with 
different  locations  in  the  semi-distributed  case.  The  data  clearly  support 
the hypothesis of economy of scale; large computers are much more cost-effective
than smaller computers (although all computers considered in this
study were quite large) and especially so as the load becomes high. That is,
with high load into a constant-capacity net, the smaller computers suffered
in comparison with the larger. Similar results have been found in communication
channels, where, for constant capacity, one large line has been found more
efficient than a bundle of smaller ones insofar as throughput is concerned.

Figure 2-3. Memory Size Cost-Effectiveness

For  response  time,  there  is  some  evidence  that  multiple  channels  or  processors 
yield better results in speed but not necessarily in terms of
cost-effectiveness.

From the experimental results, the semi-distributed configuration appears more
cost-effective  than  the  completely  centralized  case,  partially  as  a  result  of 
shorter,  and  hence  more  quickly  responding,  communication  lines.  No  attempt 
was  made  to  adjust  communication  network  capacity  or  cost  to  accommodate 
differences in the traffic distribution under the various cases. It is quite
possible that further fine-tuning could have reduced costs, or
that a better allocation of the capacity might have been found. An anomalous
situation does arise in the data for large jobs (20 MB) in that the centralized
case seems more cost-effective. This is probably due to the fact that the
bulk of the job traffic arises in the Eastern cities, and, by moving computers
away from the central moment of traffic sources, costs have been increased.
Further  investigation  of  this  aspect  of  network  optimization  is  indicated. 

Frank and Frisch (1971) and Martin (1972) have indicated approaches to the
problem for communication nets. These, in conjunction with resource-allocation
algorithms, should provide fairly ready answers to optimal location of
processing  centers. 

There  are,  of  course,  several  other  arguments  against  complete  centralization 
besides  relative  cost-effectiveness  for  throughput.  The  most  relevant  of  these 
is  the  relative  vulnerability  of  a  completely  centralized  facility  to  the 
effects of failure of processing or transmission equipment (reliability impacts),
environmental  effects  such  as  blackouts  or  brownouts  of  electrical  power, 
inclement  weather,  sabotage,  or  hostile  action.  On  the  other  hand,  larger 
processors  frequently  have  other  advantages,  such  as  faster,  more  powerful 
peripheral  equipment  as  well  as  superior  and  more  powerful  central  processing 
units, instruction repertoires, and memories. Larger computers often have
more powerful operating systems and programming languages available to them.
Facility operation and maintenance costs of a few central facilities will (or
may) also be lower, replacement parts are more easily handled, and record
keeping and administration are made easier. Nonetheless, the opportunities
for load balancing, the superior failsafe capabilities of alternative locations,
and the opportunity for specialization of some computers for specific
kinds of jobs with resultant increases in efficiency seem to bolster our
general  finding  that  semi-distributed  computing  provides  a  superior  operation. 

2.3  EXPERIMENTAL  RESULTS:  SET  II 

The second set of experiments was focused on the near-optimal design of
computer networks. The general aim was to derive criteria for assisting
in automatically generating cost-beneficial network configurations. Another
goal was to find the sensitivity of topology optimization to the
communications traffic input data and to the performance criteria.

Recall  from  Section  2.1.2  that  the  configuration  was  an  eight-center  network 
with  equal-capacity  lines.  Costs  were  based  on  line  costs,  and  reliabilities 
were  based  on  time  reliabilities.  No  acknowledgement  messages  were  allowed. 
The  beginning  topology  for  all  experiments  was  a  completely  connected  one  in 
which  there  was  a  direct  connection  between  every  two  centers.  Links  were 
removed individually in a sequential manner. Two measures of cost-effectiveness
were employed. One is based on throughput factored by cost. The curves
in Figure 2-4 are based on this criterion. In this figure, cost is graphed
versus bits/second per dollar cost.

Figure 2-5 is based on the cost-effectiveness measure of response time. Here
the workload is a constant, so that cost is plotted versus a constant (10^)
divided by the product of mean total response time and monthly cost.
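
The two cost-effectiveness measures described above can be sketched as follows. This is a minimal illustration, not the CACTOS implementation: the function names are hypothetical, the `scale` constant is a placeholder for the constant used in Figure 2-5 (its value is not assumed here), and the sample figures are invented for the example.

```python
def throughput_cost_effectiveness(throughput_bps, monthly_cost):
    """Throughput factored by cost: bits/second per dollar of monthly cost."""
    return throughput_bps / monthly_cost


def response_time_cost_effectiveness(mean_response_s, monthly_cost, scale=1.0e6):
    """A constant divided by (mean total response time x monthly cost).

    `scale` is an assumed placeholder for the plotting constant.
    """
    return scale / (mean_response_s * monthly_cost)


# Hypothetical figures for two candidate topologies (not experimental data).
candidates = {
    "ring": {"throughput_bps": 48_000.0, "mean_response_s": 2.5, "monthly_cost": 12_000.0},
    "star": {"throughput_bps": 40_000.0, "mean_response_s": 2.0, "monthly_cost": 8_000.0},
}

for name, c in candidates.items():
    print(
        name,
        throughput_cost_effectiveness(c["throughput_bps"], c["monthly_cost"]),
        response_time_cost_effectiveness(c["mean_response_s"], c["monthly_cost"]),
    )
```

Because the workload is held constant for the second measure, a configuration improves on it only by lowering response time, lowering cost, or both.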
Figure 2-6. Standard Configurations
(Panels: minimal spanning tree configuration based on geography; configuration
based on the heaviest communicating neighbors.)

Figure 2-7. Most Cost-Effective Configurations
(a) Most throughput cost-effective with articulation level 2 (point A)
(b) Most response-time cost-effective with articulation level 2 (point B)
(c) Most cost-effective with articulation level 1, link removal by
    cost-effectiveness (point C)
(d) Most cost-effective with articulation level 1, least-load link
    removal (point D)

The topology-determination process of successive link removal through
cost-effectiveness may be compared to the results achieved by a priori network
specification. A topology can be prespecified, and then the links can be
constructed on the basis of either geographic or message-traffic considerations.
Two common topologies are a ring and a star. A ring is the configuration
with a minimum number of links that gives articulation level two, while
a star has a central site and all other sites are connected only to the
central site. Another method is to use minimal spanning trees. Distance or
link values could be based on geography or traffic. Yet another method is to
link each center to the two other centers with which it has the highest message
traffic. These standard configurations are displayed in Figure 2-6, and
their cost-effectiveness is given in Figures 2-4 and 2-5, using the following
symbols:


ring (traffic)                      §
ring (geography)                    0
star (center at traffic center)     *
min. spanning tree (traffic)        Y.
min. spanning tree (geography)      Y
heaviest communicating neighbors    +


In Figures 2-6 and 2-7, the following numerical code for the cities is used:

1  Los Angeles          5  Washington, D.C.
2  San Francisco        6  Philadelphia
3  Detroit              7  New York
4  Chicago              8  Boston
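
Three of the a priori constructions named above (star, minimal spanning tree, and heaviest communicating neighbors) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, Prim's algorithm is used as one standard way to obtain a minimal spanning tree, and the four-node distance and traffic matrices are invented rather than taken from the experiments.

```python
def _link(a, b):
    """Store an undirected link as an ordered pair."""
    return (min(a, b), max(a, b))


def star(n, center):
    """Every site is connected only to the central site."""
    return {_link(center, k) for k in range(n) if k != center}


def minimal_spanning_tree(weight):
    """Prim's algorithm on a symmetric weight matrix (distance or link value)."""
    n = len(weight)
    in_tree = {0}
    links = set()
    while len(in_tree) < n:
        a, b = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: weight[e[0]][e[1]],
        )
        links.add(_link(a, b))
        in_tree.add(b)
    return links


def heaviest_neighbors(traffic, k=2):
    """Link each center to the k centers with which it has the highest traffic."""
    n = len(traffic)
    links = set()
    for i in range(n):
        partners = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: traffic[i][j],
            reverse=True,
        )[:k]
        links.update(_link(i, j) for j in partners)
    return links


# Hypothetical symmetric matrices for four centers.
distance = [[0, 1, 4, 5], [1, 0, 2, 6], [4, 2, 0, 3], [5, 6, 3, 0]]
traffic = [[0, 9, 3, 1], [9, 0, 5, 2], [3, 5, 0, 7], [1, 2, 7, 0]]

print(sorted(star(4, center=1)))
print(sorted(minimal_spanning_tree(distance)))
print(sorted(heaviest_neighbors(traffic)))
```

A traffic-based spanning tree can be obtained from the same routine by passing a matrix in which heavier traffic corresponds to a smaller weight, which is one possible convention and an assumption of this sketch.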
The curves in Figures 2-4 and 2-5 are labeled according to the following
code.

Label   Description
1       Link removal by cost-effectiveness, articulation level 1
2       Link removal by cost-effectiveness, articulation level 2
3       Link removal by least loads, articulation level 1
4       Link removal by least loads, articulation level 2
5       Uniform traffic link removal, articulation level 1
6       Uniform traffic link removal, articulation level 2


Link  removal  by  cost-effectiveness  means  that,  at  each  step,  the  link  that  is 
least cost-effective in terms of throughput per unit cost is removed. Articulation
level 2 means that the constraint was applied that the network had to
possess  link  articulation  level  2  at  each  stage  of  link  removal,  including  the 
final  network. 


Another  method  of  link  removal  was  based  on  removal  of  the  least-loaded  link 
at  a  given  stage. 

Uniform traffic link removal refers to the removal of the least-loaded link
based on uniform traffic statistics. However, cost-effectiveness and the
graph values are based on traffic that is dependent upon population and
distance.
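
The link-removal procedure described above can be sketched as a greedy loop. This is a simplified illustration under stated assumptions: each link carries a fixed score (its load, or its throughput per unit cost), whereas in the experiments these values would presumably be re-evaluated by the model after each removal; articulation level is taken to mean that the network stays connected after the loss of any (level - 1) links; and all names are hypothetical.

```python
from itertools import combinations


def is_connected(n, links):
    """Depth-first search from node 0 over undirected links."""
    adjacent = {i: set() for i in range(n)}
    for a, b in links:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for nxt in adjacent[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n


def has_articulation_level(n, links, level):
    """True if the net stays connected after removing any (level - 1) links."""
    return all(
        is_connected(n, links - set(cut))
        for cut in combinations(links, level - 1)
    )


def remove_links(n, score, level):
    """Start fully connected; repeatedly drop the lowest-scoring removable link.

    Returns the list of topologies visited, so that each stage can be
    evaluated and the most cost-effective one selected afterwards.
    """
    links = set(combinations(range(n), 2))
    stages = [frozenset(links)]
    while True:
        removable = [
            l for l in links if has_articulation_level(n, links - {l}, level)
        ]
        if not removable:
            return stages
        links.remove(min(removable, key=lambda l: score[l]))
        stages.append(frozenset(links))


# Hypothetical per-link scores for a four-center network.
score = {(0, 1): 5, (0, 2): 1, (0, 3): 4, (1, 2): 3, (1, 3): 2, (2, 3): 6}
print(sorted(remove_links(4, score, level=2)[-1]))  # ends in a ring
print(sorted(remove_links(4, score, level=1)[-1]))  # ends in a tree
```

With the level-2 constraint the loop stops at a ring, the minimum-link configuration with articulation level two; with level 1 it stops at a spanning tree.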


In  Figures  2-4  and  2-5,  the  points  at  which  some  of  the  curves  attain  maximum 
value  are  of  interest.  Some  of  these  have  been  labeled,  and  the  corresponding 
configurations  are  given  in  Figure  2-7. 

Two conclusions are evident. First, note that topology determination based on
cost-effective and least-loaded considerations yields far better throughput
cost-effectiveness (see Figure 2-4) than any standard topology assumed a priori.
However, also note that assuming a uniform traffic matrix does no better (in
fact, poorer) than standard least-cost topologies. Hence, although superior
when good information exists on traffic loads, the methodology is quite
sensitive to inaccuracies in traffic estimates. Quite similar results are
achieved for response-time effectiveness, except that ranking node
interconnections on probable load levels yields fairly good results (see points Y for
a minimally articulated net and + for a two-articulated net in Figure 2-5).
Thus, if traffic estimates are only good enough to establish probable ranks,
that information will still yield better cost-effective performance than
ignoring traffic patterns altogether.

The second conclusion is that the optimum configuration depends on the
cost-effectiveness measure and the link-removal process. This is supported in
the figures by comparing the maximum value with the constraint of articulation
level two. In Figure 2-4, this point is labeled A and is attained by link
removal based on cost-effectiveness. In Figure 2-5, however, the most
cost-effective point is a different point (labeled B) and is obtained by the
least-load link-removal process. These two configurations are quite different, as
is revealed by comparing Figure 2-7 (a) and 2-7 (b). Since point A in Figure
2-4 (point B in Figure 2-5) corresponds to optimization by throughput (response
time in Figure 2-5), throughput and response time are not necessarily optimized
in the same configuration.

To consider different link-removal methods with or without the link articulation
level being two, it is sufficient to examine either Figure 2-4 or Figure
2-5. The sensitivity of the maximum value attained to the link-removal method
is reflected in the distinct curves and the points where a maximum value is
attained. Thus, it would be necessary to have several methods of link removal
on hand for an automated process.

Another conclusion is that all optimization removal procedures depend on the
traffic statistics and, in particular, the job-arrival matrix, which gives the
number of jobs being sent between any two centers. This can easily be seen in
both Figure 2-4 and 2-5 by examining the uniform-traffic link-removal curves.

In  both  figures,  the  curves  follow  a  zig-zag  path  and  are  dominated  by  the 
least-loaded  link-removal  cases.  This  is  due  to  the  experiments  being 
structured  by  one  value  of  a  parameter  and  evaluated  on  the  basis  of  another 
value  of  the  same  parameter.  This  emphasizes  the  importance  of  accurate 
message  statistics  in  the  network  design  process.  It  also  points  out  the 
importance  of  message  statistics  on  network  performance  and  measurement. 

As  might  be  expected,  the  most  cost-effective  networks  are  those  that  have 
articulation  level  one.  This  is  seen  in  both  Figures  2-4  and  2-5,  where 
points  C  dominate  points  A  and  B.  Furthermore,  the  configuration  in  Figure  2-4 
and  the  traffic  statistics  based  on  population  and  distance  reveal  that  the 
most  cost-effective  network  configurations  are  those  that  are  one-connected 
in  remote  or  light  traffic  areas. 

However, although level-one networks are more desirable from a cost-effectiveness
point of view, they are not desirable when reliability is a consideration.

A  measure  of  reliability  is  the  expected  number  of  node  pairs  that  will 
communicate.  The  failure  rate  can  then  be  defined  as  the  probability  that  a 
pair  of  nodes  will  not  be  able  to  communicate  at  a  given  time.  With  these 
definitions, consider the example of Figure 2-8. In the first network, there
are 13 links, the articulation level is two, and the failure rate is .00012.
However, in the second network the number of links is 12, the articulation
level is one, and the failure rate is .00681. In this particular example, by
deleting the link from node 2 to node 7, the failure rate increased by a
factor of 57. This is just one example of the effect of the articulation level.
The exact dependence and increase in failure rate would depend on the network
structure.
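
The failure-rate definition above can be illustrated by brute-force enumeration of link states. This is a sketch under assumptions that are not taken from the report: links fail independently with a common probability, nodes never fail, and the two three-node networks are invented (the topologies of Figure 2-8 are not reproduced). The function names are hypothetical.

```python
from itertools import combinations, product


def component_labels(n, links):
    """Label each node with the smallest node number in its component."""
    label = list(range(n))
    changed = True
    while changed:
        changed = False
        for a, b in links:
            low = min(label[a], label[b])
            if label[a] != low or label[b] != low:
                label[a] = label[b] = low
                changed = True
    return label


def pair_failure_rate(n, links, link_failure_prob):
    """Probability that a randomly chosen node pair cannot communicate."""
    links = list(links)
    pairs = list(combinations(range(n), 2))
    expected_cut_pairs = 0.0
    for state in product((True, False), repeat=len(links)):
        prob = 1.0
        for up in state:
            prob *= (1.0 - link_failure_prob) if up else link_failure_prob
        label = component_labels(n, [l for l, up in zip(links, state) if up])
        cut = sum(1 for a, b in pairs if label[a] != label[b])
        expected_cut_pairs += prob * cut
    return expected_cut_pairs / len(pairs)


# Hypothetical example: a triangle (level 2) versus a path (level 1).
triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
print(pair_failure_rate(3, triangle, 0.1))
print(pair_failure_rate(3, path, 0.1))

# Ratio of the two failure rates quoted in the text for Figure 2-8.
print(0.00681 / 0.00012)
```

The expected number of communicating pairs is the number of pairs multiplied by one minus this rate. Enumeration grows as 2 to the number of links, so it is practical only for small networks such as the 12- and 13-link examples.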

To summarize this section, we note that articulation level one is desirable
from a cost-effectiveness point of view but not acceptable from a reliability
point of view. Furthermore, a link-removal method such as is described here is
better than a standard configuration because it is more sensitive to the
structure of the network. However, to find the most cost-effective
configuration, several methods of link elimination must be tried. When
performing the analysis, statistics as close to the estimated or actual traffic
as possible must be used for good results.

Figure 2-8. Sample Networks: Reliability and Articulation Level
3.  SYSTEMS ANALYSIS

3.1  INTRODUCTION

This  section  examines  several  systems  analysis  tasks  related  to  existing  and 
projected computer networks. The purpose of these experiments was severalfold.
First, the network analyses could be used to validate the CACTOS model,
especially on the communication scale. Second, some trade-off relationships
could be discerned. Third, through these experiments, the CACTOS work could
become relevant to the near-term planning objectives of DoD agencies. Another
benefit  of  the  analyses  was  to  make  known  some  of  the  analytical  methods 
necessary  for  a  quantitative  system  view. 

The  analyses  ranged  from  an  existing  Marine  Corps  manpower  system  (JUMPS/MMS) 
to a projected system for the Air Force Logistics Command and a modified
version of the projected GSA network.

3.2  MARINE CORPS PERSONNEL SYSTEM¹

The  Marine  Corps  Joint  Uniform  Military  Pay  System/Manpower  Management  System 
(JUMPS/MMS)  is  centered  in  the  Marine  Corps  Automated  Service  Center  (MCASC) 
in  Kansas  City,  Missouri,  with  satellite  Data  Processing  Installations  (DPIs) 
at  seven  Marine  Corps  bases  in  the  continental  United  States  and  overseas. 
(Initial simulation runs were made using data from eight locations, but the
DPI  in  Danang,  Vietnam,  has  since  been  phased  out.) 

The goals of JUMPS/MMS include the improved management of manpower appropriation
and distribution. The details of the system are described by Willmorth
[2]. The network is shown in Figure 3-1, along with the basic computer and
AUTODIN connections. The main center is at Kansas City at the MCASC. The
network operates in a batch mode. Problems that come about are due to the
priority of personnel message traffic in AUTODIN and at the Marine Corps sites.

¹ This part of the CACTOS project is deeply indebted to United States Marine
Corps personnel, especially Colonel J. Marsh and Lt. Col. V. Albers.


The  analyses  for  the  JUMPS/MMS  system  were  aimed  at  several  goals.  First, 
the personnel system could be used to validate the CACTOS model. This was
accomplished  successfully.  The  model  and  actual  operation  statistics  differed 
by  less  than  5%.  The  details  are  described  by  Willmorth  [2]. 


The second objective was to evaluate the network in terms of improving response
time. The response cycle in processing a change through the network was 5 to
10 days on the average. Analyses were performed that showed a 5-10%
reduction in time delay by moving part of the data base from Kansas City to
a USMC training center. The data on personnel in training prior to duty
assignment would then be maintained outside of the home base of the system.

In this example, because of queueing delays in AUTODIN and at Kansas City,
load sharing of file updating did not significantly improve the system
performance, within the constraints of circuit switching and priority
levels.


A third set of analyses was performed to determine the effects of increasing
the priority level of some or all of the personnel traffic. This is shown
in  Table  3-1.  In  this  table  the  response  times  for  the  network  are  given  for 
10%,  20%,  and  30%  of  the  personnel  messages  having  a  higher  priority.  This 
would  be  the  case  in  an  emergency  or  exercise  deployment  of  Marine  Corps 
personnel  and  might  occur  periodically  in  restaffing  and  reassignment. 
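
The MCASC column of Table 3-1 appears to be additive: in each priority case the elapsed time equals the priority response plus the remaining messages plus the remaining jobs. The short check below, whose helper names are hypothetical, verifies that reading against the tabulated values and reports how far each case falls below the present-load total response of 42:59.

```python
def minutes(hhmm):
    """Convert an 'H:MM' or ':MM' table entry to minutes."""
    hours, mins = hhmm.split(":")
    return int(hours or 0) * 60 + int(mins)


# MCASC column of Table 3-1, keyed by percentage of priority jobs.
mcasc = {
    10: {"priority_response": "3:06", "remain_messages": "19:05",
         "remain_jobs": "20:22", "elapsed": "42:33"},
    20: {"priority_response": "5:23", "remain_messages": "18:57",
         "remain_jobs": "17:57", "elapsed": "42:17"},
    30: {"priority_response": "8:38", "remain_messages": "17:27",
         "remain_jobs": "15:40", "elapsed": "41:45"},
}
PRESENT_TOTAL = minutes("42:59")  # present-load total response at MCASC


def check_rows(table):
    """Return {share: (components sum to elapsed?, minutes below present load)}."""
    results = {}
    for share, row in table.items():
        total = (minutes(row["priority_response"])
                 + minutes(row["remain_messages"])
                 + minutes(row["remain_jobs"]))
        elapsed = minutes(row["elapsed"])
        results[share] = (total == elapsed, PRESENT_TOTAL - elapsed)
    return results


for share, (consistent, saved) in check_rows(mcasc).items():
    print(f"{share}% priority: consistent={consistent}, {saved} min below present load")
```

On this reading, the priority traffic itself is served within a few hours while the overall elapsed time changes only slightly.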

Analyses  were  also  run  to  determine  the  effects  of  a  dedicated  personnel 
network  with  the  present  configuration  at  each  USMC  site.  The  results  are 
given  in  Table  3-2.  (The  shared  column  assumes  a  90%  load  factor.)  In  this 
table,  the  single  message  response  time  decreases  by  over  half.  Total  response 
for  the  network  and  MCASC  experienced  a  similar  reduction.  This  indirectly 
shows  the  multiple  effects  of  message  switching  and  higher  priority  levels 
TABLE 3-1

PRIORITY DIVISION OF JOBS
(Hours:Minutes)

                          MCASC    Net Average

PRESENT LOAD
  Communication           19:25       19:25
  Computation             23:34       27:04
  Total Response          42:59       46:29

PRIORITY HANDLING

10% PRIORITY JOBS
  Priority Messages         :20         :20
  Priority Jobs            2:46        2:50
  Priority Response        3:06        3:10
  Remain Messages         19:05       19:05
  Remain Jobs             20:22       21:16
  Elapsed Time            42:33       43:31

20% PRIORITY JOBS
  Priority Messages         :38         :38
  Priority Jobs            4:45        4:02
  Priority Response        5:23        4:40
  Remain Messages         18:57       18:57
  Remain Jobs             17:57       18:06
  Elapsed Time            42:17       41:43

30% PRIORITY JOBS
  Priority Messages        1:16        1:16
  Priority Jobs            7:22        6:09
  Priority Response        8:38        7:25
  Remain Messages         17:27       17:27
  Remain Jobs             15:40       15:40
  Elapsed Time            41:45       40:32

TABLE 3-2

OPERATING TIMES FOR DEDICATED VS. SHARED OPERATIONS
(Hours:Minutes)

                        Dedicated      Shared
                        (MMS Only)     (90% Load)

Single Message              :11           :37
All Messages               1:10         19:25

MCASC Average
  Single Job                :31          1:33
  All Jobs                20:42         23:34

Network Average
  Single Job               1:06          3:59
  All Jobs                16:50         27:04

Total Response
  MCASC                   21:52         42:59
  Network                 18:00         46:29

for the traffic. In terms of cost analysis in a dedicated system, communications
costs including modems would be between 10% and 15% of the total network
cost. The exact figure would depend on the terms and conditions of
existing hardware and dedicated communication lines.
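
The size of the dedicated-versus-shared improvement can be read directly from Table 3-2. The sketch below, with hypothetical helper names, computes the fractional reduction in time for three of the tabulated rows.

```python
def minutes(hhmm):
    """Convert an 'H:MM' or ':MM' table entry to minutes."""
    hours, mins = hhmm.split(":")
    return int(hours or 0) * 60 + int(mins)


def reduction(dedicated, shared):
    """Fractional reduction in time when moving from shared to dedicated."""
    return 1.0 - minutes(dedicated) / minutes(shared)


# (dedicated, shared) entries taken from Table 3-2.
table_3_2 = {
    "single message": (":11", ":37"),
    "total response, MCASC": ("21:52", "42:59"),
    "total response, network": ("18:00", "46:29"),
}

for name, (dedicated, shared) in table_3_2.items():
    print(f"{name}: {reduction(dedicated, shared):.1%} reduction")
```

The single-message time falls by roughly 70 percent, while the total response falls by about 49 percent at MCASC and about 61 percent for the network average.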

To  conclude  this  subsection,  we  note  that  the  model  was  validated  using  actual 
AUTODIN  statistics.  Secondly,  load  sharing  within  the  present  environment 
produced  only  marginal  improvement.  Third,  a  dedicated  message  switching 
system  or  higher  priorities  in  AUTODIN  produce  increases  in  performance  by 
substantially  reducing  response  time. 

3.3  ADVANCED  LOGISTICS  SYSTEM 

The Marine Corps Personnel System is a centralized network. In contrast, an
analysis was performed on a planned decentralized configuration of the AFLC
Advanced Logistics System (ALS). Analysis here was aided by the forecasted
traffic loads developed by Turhaly and Palmer [1], who conducted transmission
simulation on individual lines. Whereas the Marine system was based on circuit
switching, the ALS was focused on message switching.

This  study  considered  the  ALS  in  a  general  analytic  framework  wherein  the  six 
data  centers  were  considered  nodes  in  the  network.  The  results  described 
below  in  terms  of  response  time  are  probably  low,  owing  to  the  omission  of 
message processing devices. The six bases considered were WPAFB, WRAMA, OCAMA,
SMAMA, OOAMA, and SAAMA. The configurations evaluated included those of
Turhaly  and  Palmer  and  a  ring  configuration.  Distances  were  obtained  from 
Great  Circle  distance  grids. 

Channel  capacities  were  based  on  anticipated  loads  reduced  to  accommodate 
header  messages.  Acknowledgement  traffic  was  allowed.  The  use  of  fixed-path 
routing  increased  some  of  the  capacity  assignments  above  the  estimate  in  the 
ALS  study.  The  capacities  are  given  in  Table  3-3. 
TABLE 3-3

CHANNEL CAPACITIES

Sites:  1 WPAFB    2 WRAMA    3 OCAMA    4 SMAMA    5 OOAMA    6 SAAMA

              Route  Capacity    Route  Capacity    Route  Capacity
                     (KB)               (KB)               (KB)

ring          1-3    12.0        3-5    19.2        2-6    12.0
              1-2    12.0        4-5     9.6        4-6     9.6

wheel         1-2     9.6        3-4     7.2        2-6     7.2
              1-3     7.2        3-5     4.8        4-5     4.8
              2-3     4.8        3-6     9.6        4-6     7.2

star          1-3    19.2        5-3     9.6        4-3    19.2
              2-3     9.6        6-3    19.2

connected     1-2     2.4        2-5     1.2        4-5     1.2
              1-3     2.4        2-6     4.8        2-4     1.2
              1-4     3.6        3-4     4.8        4-6     2.4
              1-5     2.4        3-5     3.6        5-6     2.4
              1-6     4.8        3-6     7.2        2-3     3.6

TABLE 3-4

RESPONSE TIMES FOR PRIORITY LEVELS (ALS)

Priority   Response            Message Size            Job Size

1          0-10 min.           56 char. inquiry,       25
                               396 char. response
2          10-30 min.          550 char.               50
3          30 min. to 2 hr.    550 char.               50
4          2-6 hr.             550 char.               50
5          over 6 hr.          1200 char.              1.25

TABLE 3-5

RESPONSE TIMES (ALS)

                 Response        Total            Percent
                 Time (Secs.)    Response Time    Utilization

Priority 1
  Computer         1.160                            20
  Ring              .260           1.683             4
  Wheel             .365           1.890             4
  Star              .234           1.630             4
  Connected         .750           2.661             5

Priority 2
  Computer         2.907                            13
  Ring              .674           4.256             4
  Wheel             .907           4.210             5
  Star              .492           3.890             4
  Connected        1.550           6.010             5

Priority 3
  Computer         8.926                            36
  Ring              .807          10.541             9
  Wheel            1.043          11.012            13
  Star              .610          10.146             9
  Connected        1.758          12.450            12

Priority 4
  Computer        23.907                            60
  Ring              .959          25.824            13
  Wheel            1.210          26.321            16
  Star              .715          25.336            12
  Connected        2.256          28.819            19

Priority 5
  Computer          .331                            30
  Ring             5.693          11.537            56
  Wheel            5.821          11.791            54
  Star             2.014           4.023            40
  Connected        8.752          17.652            77

The priority classes, response times, and message size are given in Table 3-4,
along  with  the  respective  job  sizes  in  megabits.  The  latter  are  based  on  one- 
second  processing  of  on-line  queries  and  two-to-ten  seconds  for  a  batch  job. 

In  the  analysis,  job  and  message  interarrival  rates  were  set  equal.  Priority 
traffic  was  considered  in  a  declining  balance  scale. 

The  response  times,  using  the  CACTOS  model,  are  given  in  Table  3-5.  These  are 
the  times  needed  to  process  one  input  record.  Some  other  properties  considered 
included  the  average  number  of  links  traversed:  3  for  a  ring  net,  1.6  for  a 
star,  1.2  for  a  wheel,  and,  of  course,  2  for  the  completely  connected  case. 

In Table 3-5, the lowest-priority class is on tape files, so that actual response
would  be  increased  over  the  numbers  given  for  a  full  tape.  Utilization  rates 
are  also  given  in  Table  3-5.  With  highest  total  capacity,  it  is  not  unexpected 
that  the  connected  net  has  the  highest  utilization.  The  increased  utilization 
of  the  computers  at  the  fourth-priority  level  indicates  that  the  system  is 
computation  bound.  In  terms  of  comparing  configurations,  the  wheel  appears  to 
be  the  most  suitable  in  terms  of  cost-effectiveness.  This  was  based  upon 
commercial  rates  and  the  distances  as  computed  between  centers.  Obviously,  the 
star  is  the  cheapest  when  backup  systems  are  ignored. 

In  summary,  then,  this  analysis  evaluated  several  alternative  configurations 
in terms of performance criteria. Priority levels at which the net is computation
bound were determined. Using the model, trade-offs were performed on
configuration and topology, and channel capacities were computed.

3.4  REQUIREMENTS  ANALYSIS  EXPERIMENTS 

Using a modification of the Request for Proposal issued by the General Services
Administration, the CACTOS project undertook the task of performing a requirements
analysis to determine the most cost-beneficial dedicated network configuration
given the environment of the system. The goal here was to determine
the best percentage of fiscal resources in communication and to find the
best mix between CPU and core-related components on the computation side.

The GSA system was envisioned as a national network with nodes in the following
cities [3]:

Boston                   Kansas City
New York                 Ft. Worth
Washington, D.C. (2)     Denver
Atlanta                  Auburn, Wash.
Chicago                  San Francisco
St. Louis                Houston
Huntsville


Core requirements were specified as 300 million bytes. The experiments assumed,
for a response-time threshold, that 90% of messages had to have a mean total
response time of 10 seconds or less. A fiscal ceiling of $fi 5 . 000 per month for
system operation was assumed. The system is assumed to be data-management
oriented, so that the major part of the processing is I/O related.

The  purpose  of  the  analyses  was,  first,  to  determine  the  most  cost-beneficial 
combination  of  communication  and  computation  costs,  in  terms  of  throughput  per 
unit  cost.  The  second  phase  was  to  then  determine  the  percentage  of  resources 
devoted to CPU versus core-storage-related components. The analysis results
revealed the relationship between throughput and percentage of fiscal resources
in communications shown in Figure 3-2. Throughput for Figures 3-2 and 3-3 is
measured in multiples of the job arrival matrix (constant x number of jobs).

The optimal percentage is less than 1.0%, which is consistent with other
experiments in dedicated systems.

In  Figure  3-3,  throughput  is  graphed  versus  percentage  of  resources  in  CPU. 
Throughput  for  Figures  3-2  and  3-3  is  measured  in  multiples  of  the  job-arrival 
matrix  (constant  x  number  of  jobs)  after  the  communications  expenses  of  9.3  have 
been  removed  from  the  fiscal  threshold.  Several  graphs  are  given  for  various 
percentages of job division between I/O and CPU (10%, 25%, 50% CPU). These
Figure 3-2. Percentage in Communication (throughput versus percentage of fiscal
resources in communication)

Figure 3-3. Percentage in CPU (throughput versus percentage of computation
fiscal resources in CPU)

reveal that even with an I/O-oriented system, the amount of resources in CPU
should be high relative to core-related components. In this case, because only
truly attainable configurations are being considered, the maximum feasible
percentage in CPU is 62.3%. The reason for this high percentage is related, in
part, to the fiscal and response time boundaries and also to the dependence of
throughput on the CPU-related parameters.

Using  IBM  third-generation  hardware  as  an  example,  the  optimal  configuration 
consists  of  360/65  machines  in  Denver  and  at  one  Washington  site,  with  360/20 
machines  at  the  remaining  sites.  The  configuration  can  be  either  a  double  star 
clustered  at  the  360/65  sites  or  a  ring  structure.  The  latter  is  probably 
preferable from the standpoint of vulnerability because it has articulation
level  2. 
4.  CONCLUSION AND RECOMMENDATIONS


4.1  REMARKS 


The  basic  goal  of  the  CACTOS  Project  was  to  develop  guidelines  for  use  in 
the selection and implementation of cost-effective computation and communication
systems. In striving to achieve this goal, the Project sought to:

•  Develop a methodology for describing and analyzing computation
   and communication systems.

•  Determine DoD information processing and transmission needs,
   as these apply to specific operational needs and functions.

•  Investigate the cost-effectiveness of various technological
   trade-offs.

•  Develop optimal planning policies for the design of computation
   and communication networks and for the incorporation of evolving
   technology into DoD systems.


In  developing  a  methodology,  the  Project  developed  both  an  analytic  and  a  discrete 
simulation  model  of  computation  and  communication  networks.  In  terms  of  what 
could  be  done,  the  models  that  have  been  developed  represent  a  beginning. 

Original effort has gone into the development of a combined computation-communication
model that is available for on-line experimentation. In addition to
requiring information on the computing configuration, the model requires
careful formulation of other components of an information processing system,
which include switches, multiplexors, and man-machine interfaces. While
the present model seems adequate for the evaluation of many response-time
and queueing questions, it could easily be expanded to include capabilities
for examining many other max-flow, min-cost, and resource-allocation problems.
Considerable effort has been expended in the Project on the verification of
simulation results against real data. Each expansion of the model should be
treated similarly to ensure that simulation results reflect actuality.

In determining DoD information processing needs, the Project discussed data
processing problems and systems with many military and governmental agencies
and conducted investigations of several existing and planned command and control
networks, including the Marine Corps Manpower Management System and the Air Force
Advanced Logistics System. Again, in terms of what could be done, this is also
just a beginning. DoD is now in the process of developing many new systems and
replacing many obsolete systems with up-to-date equipment and procedures.

While it is almost certain that all of these will receive a great deal of
careful system analysis, it is equally true that most of them could profit from
the sort of technological trade-off analysis that a CACTOS project could provide.
Unfortunately, neither adequate data processing requirements nor adequate
evaluation tools will exist without a considerable research and development
effort to provide them.

In investigating the cost-effectiveness of technological trade-offs, the Project
investigated, in some depth, the potential trade-offs that were available
to system planners. To develop additional technological depth, Project personnel
developed a preliminary technological forecast of future developments
in computation and communication. Of the many potential trade-offs, the ones
that the Project examined deal largely with the economies of scale and
the distribution of intelligence (information processing power) within the
teleprocessing system. The economies of scale and the economies of technological
innovation seem incontrovertible, but the practical implementation of systems
that take advantage of these factors is not imminent. A considerable amount
of work should be done in the development of practical replacement policies
and in the design of new systems. While the Project found evidence that
semi-distributed computing networks have some advantages, much more needs
to be done in examining the location of processing centers, in allocating
functions to various levels in a network, in locating information stores,
and in assessing the advantages or disadvantages of specialized processors.

In fact, the whole arena of technological trade-offs has hardly been tapped,
and a great deal remains to be done before exact guidance can be given to
system designers.

In the development of optimal policies for system design and in the development
of efficient replacement policies, a start has been made. The Project
reaffirmed some of the trade-offs expressed in the past and formulated
some extremely limited laws concerning the interrelationships among computation
and communication elements. The development of further system design tools and
guides is largely dependent upon the continued evaluation of technological
trade-offs. Sharpe [3] has made the initial contribution to the structuring of this
field, but a great deal remains to be done. Just in terms of developing
cost-performance relationships, the only system components for which reasonable
trends seem to be established are central processing units. Even here, a
myriad of factors inhibit the declaration of a clear set of principles for
predicting system costs and performance. For many other system elements,
historical data upon which to base future predictions do not even seem to
exist. Developing such trend data is partly inhibited by the ways in
which data processing and transmission functions may be combined within a
particular piece of system equipment. That is, the development of economic
information is dependent in part upon studies of the allocation of functions (e.g.,
the distribution of intelligence) to various parts of the system, which in turn
is influenced by what is known about costs of configuring a system one way
or another. A continuation of investigations in this area should be of
considerable benefit to the state of the art of teleprocessing systems.

Replacement policy in an era of rapid technological development is certainly a
matter of great concern. Roberts [1] has stated some of the considerations that
impact a replacement policy for computers, such as the number of years before
the acquisition of a new computer, the length of time an old system is to
overlap with the new, the growth of the work load, the relative advantages of
lease and purchase, the hidden costs of software, facility and retraining
requirements, and the interrelationships among these. Schneidewinde [2] has also
formulated a model for predicting optimal replacement of computers, but in both
cases these considerations need to be expanded to consider the kinds of
multi-computer teleprocessing networks of concern to command and control systems. At
present, except in the most simplistic terms, computation and communication
system replacement policies cannot be recommended to DoD. It may be
obvious to all that many current DoD systems are technologically obsolete
and probably economically inefficient, but advising DoD on the policies
that it should adopt to keep its systems technologically current and optimally
cost-effective is most questionable without further precise formulation and
evaluation of technological and procedural trade-offs.

Another trend that needs to be addressed is the increasing degree of integration
of information processing networks. There is a proliferation of systems
for both the processing and the transmission of information. There is an
increasing need for the exchange of data among systems. To the DoD user of
information, there is a definite need for the separate information systems to be
"transparent" to his use. That is, when he turns to his control and display
console, he does not care where the information is stored or what system is
processing it. He wants the needed information to be delivered to him without
hyperbole in procedure or content. Such system integration, given the plethora
of existing systems and the procedures for using them, is more difficult than
designing a new system. Ways and means of overcoming system incompatibilities
and of establishing data and procedural standards need to be studied.

By and large, DoD is aware of these problems and is approaching them, largely
on an individual system basis. It is highly recommended that centralized
DDR&E support be given to such study effort, so that DoD-wide policies can be
established. Projects such as CACTOS offer a great many benefits to the
development of computation and communication systems for command and control
applications. Much favorable notice has been given the effort, but much more is
necessary before such a project can truly impact DoD decision making. Some
such centralized effort to develop network design guidelines and cost-effectiveness
evaluation techniques should be established on an ongoing basis to assist
DoD system procurement efforts. There is a vast amount of detailed analysis to
be done, but these analyses could save DoD a great deal of unnecessary effort
and expenditure of funds on suboptimal systems.

APPENDIX A


COMPUTATION  AND  COMMUNICATION  TRADE-OFF  STUDIES:  AN  ANALYTICAL  MODEL  OF  COMPUTER  NETWORKS  * 


INTRODUCTION 


It is probably safe to say that the most
pervasive and significant development in computer
usage during the 1960's was the rise of
the time-sharing system. The phenomenal acceptance
of time-sharing as a modus operandi for
computer systems is a direct result of the many
benefits which accrue from the basic technique
of simultaneously allowing users to share, on a
local basis, the total resources of the computer.

It now appears that the development of the
1970's which will most closely parallel the
time-sharing phenomenon of the 1960's is the rise of
computer networks. By computer network is
meant a system comprised of two or more computers,
usually at different sites, connected
together by communication links, in which computation
is a primary function of the system
and not merely ancillary to the communication
function. (Communication may also be a primary
system function.) Just as time-sharing increased
the power of the computer through local sharing
of computer resources, so computer networks can
provide another dimension of power to the
"user's machine" through resource sharing on a
more global scale. The resources shared in computer
networks include not only hardware facilities,
but data and software as well. The
following are some of the efficiency gains
achievable through the networking of computers:

1. Duplication of hardware facilities can
be eliminated or greatly reduced. This is
particularly true in networks which include a
wide variety of computer sizes and types.
Access to a remote computer with some feature
required by a user can eliminate the need to
purchase a similar facility at the user's site.

2. Programs can be made to run on the
computers which handle them efficiently, rather
than being forced to run on local equipment
which may be poorly designed for a particular
problem.

3. Duplication of applications software
from site to site can be reduced. This
eliminates the sometimes nasty problem of program
transferability among incompatible machines.

4. Electronic and manual transshipment of
large amounts of data, with its associated costs
and delays, can be eliminated by operating on
remote data bases over a network.


5.  Queueing  and  overload  problems  at 
certain  facilities  can  be  alleviated  by  load¬ 
sharing  schemes,  whereby  jobs  are  routed  to 
facilities  which  have  lighter  loads.  This 
works best, of course, in networks with
similar  or  identical  computer  facilities  at 
more  than  one  node. 

6. Special purpose languages, which (as
compiler construction techniques become more
sophisticated) appear to be a cost-effective
means  of  solving  certain  problems,  need  be 
implemented  on  only  one  computer  which  is 
accessible  through  a  network. 

7. Overall system reliability can be
greatly enhanced if alternate computer facilities
can be accessed via a network in the event
of a system failure at one node. The topology
of the network can be designed so as to minimize
the likelihood of system failure due to communication
component difficulties, as well.

8.  In  military  and  other  applications 
where vulnerability to attack or sabotage is a
significant  consideration,  computer  networks 
with  suitable  topology  characteristics  can 
provide  a  degree  of  invulnerability  which  cannot 
be  achieved  by  single-site  systems. 

9.  Overall  system  degradation  due  to 
errors  or  component  failure  can  be  "graceful" 
in a network, whereas it might be catastrophic
if  networking  were  not  part  of  the  system  design. 

In summary, the user who is communicating
with  a  network  of  computers  can  have  at  his 
disposal  a  much  more  powerful,  versatile, 
efficient,  and  reliable  tool  than  the  user  who 
is  restricted  to  a  single  computer.  For  these 
reasons,  and  because  technological  progress  has 
brought  the  necessary  concepts  to  fruition,  a 
rapid  proliferation  of  computer  networks  is 
anticipated  in  the  current  decade. 

Careful analysis and design of computer
networks, therefore, has now become a matter of
consummate importance if their full power and
cost effectiveness are to be realized. With
these considerations in mind, the Department of
Defense, through its Advanced Research Projects
Agency, has sponsored a broad program of
research into the relevant issues. The results
of one part of this effort, the Computation and
Communication Trade-off Study (CACTOS), are
reported in this paper, with an emphasis on the
quantitative analytical tools developed for the
study. The development of adequate tools for
quantitative analysis of the behavior of computer
networks and of the complex interrelationships
among the many parameters involved in
their design and implementation constitutes an
important first step in making the right
decisions about computer networks over the next
several years, decisions which will have major
impact on military, government, corporate and
public interests. Such tools are necessary to
identify possible mismatches between projected
needs and capabilities and to ensure that the
proper trade-offs are being made to best serve
the needs of the entire computer-using community.


THE  CACTOS  MODEL 


Meaningful analysis of computer networks
demands quantitative analytical tools; to this
end, the CACTOS analytical model was developed
and implemented under System Development
Corporation's ADEPT and ICOS time-sharing systems.
To allow the user to quickly perform
experiments and explore conclusions tentatively
inferred from previous calculations, a fast,
interactive tool was desired which would allow
great flexibility and yet minimize user inputs
when the current test case is similar to a
previous one; the implementation of the CACTOS
model achieves these objectives to a high degree.

The  primary  performance  characteristics  of 
a  computer  network  are  its  response  time  (time 
between  transmission  of  an  input  from  the  user's 
terminal  and  receipt  at  the  terminal  of  an  out¬ 
put  response  from  the  system)  and  throughput 
(maximum  rate  at  which  the  system  can  perform 
work).  Measures  of  these  characteristics  are 
the  principal  outputs  of  the  CACTOS  model. 
Although  the  two  parameters  are  correlated, 
they  are  not  deterministically  related.  For 
example,  designing  a  system  to  minimize  response 
time for a given cost does not guarantee that
throughput  will  be  maximized  for  that  same 
cost. 


Inputs to the model are the values of
parameters which describe the communication
hardware, computation hardware, and workload,
including some software characteristics, of the
system under study. Thus, the model does not
design systems; the user designs systems and
the model helps him by estimating the performance
levels of the various alternatives.

Figure A-1 is a schematic diagram which
depicts the organization of the analytical model
itself. At its heart lies the "Communications
Queueing Model." This module considers the
communications network, its hardware characteristics,
its topology, certain characteristics
of the communication methodology, and the
communication workload. A queueing analysis is
performed which computes the average communication
delay of the whole system. The message
load on each communication link is computed by
the message-routing module. A topological
analysis is also performed; its primary value
is in vulnerability studies because it indicates
the minimum number of links and nodes which must
be removed from the system to break communication.
The topological analysis is also important
when one is trying to correlate such
topological parameters as radius (distance, in
links, from the most central node to a peripheral
node), diameter (longest distance between
any pair of nodes), and connectivity (minimum
number of links connected to a node) of a network
with the output performance parameters.
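
The three topological parameters defined above can be computed with a breadth-first search from every node. The following is a minimal sketch, not the CACTOS implementation: the function names and the adjacency-list representation are illustrative assumptions, and the network is assumed to be connected with duplex links.

```python
from collections import deque


def hop_distances(adj, start):
    """Breadth-first search: number of links from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def topology_parameters(adj):
    """Radius, diameter and connectivity as defined in the text.

    adj maps each node to the list of its neighbours (assumed connected).
    """
    # Eccentricity of a node: distance, in links, to the node farthest from it.
    ecc = {u: max(hop_distances(adj, u).values()) for u in adj}
    return {
        "radius": min(ecc.values()),      # eccentricity of the most central node
        "diameter": max(ecc.values()),    # longest distance between any node pair
        "connectivity": min(len(n) for n in adj.values()),  # fewest links at a node
    }


# Hypothetical five-node examples: a ring and a star.
ring = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
```

On these examples the ring has connectivity 2 at every node while the star has connectivity 1, which is the kind of difference the vulnerability analysis is meant to expose.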

An analogous computation queueing model
evaluates the computational load at each node
and the overall average delay due to the
computational processing and associated queueing.
This evaluation considers the effective
processing rate of the computer at each node
and the frequency and size of jobs to be
processed there. The effective processing rate
is generated, in turn, by the computer throughput
model, which considers both the characteristics
of the computing equipment at the node
and the software characteristics of the jobs to
be processed there.

Finally, the outputs of the communication
and computation queueing models are combined
to give the overall response time and throughput
values for the entire system.

Assumptions 

Before describing the model in any detail,
we must dwell, at least briefly, on the
assumptions which have been made in its derivation.
As is the case with any analytical
model, the user must be careful, when using it,
to be certain that assumptions made in the
derivation of the model either are true in
his situation or have little effect on the
results.


1.  There  are  two  types  of  jobs  being 
processed  by  the  system  being  modeled:  remote 
jobs and local jobs. A remote job consists of
a message (data transmission across one or
more links of the network), followed by a computation
at the node to which the message was
addressed,  followed  by  a  return  message.  A 
local  job  consists  of  a  computation  only,  with 
no  demands  on  the  network's  communication 
resources. 

2.  Each  message  has  a  single  source  node 
and  a  single  destination  node. 


Figure A-1. Information Flow in the CACTOS Model

3. Fixed-minimum-path routing is used.
This means that all messages originating at a
particular node and destined for another particular
node will follow the same path, and
this path will be a minimum path (fewest links)
between the nodes. In the event that there is
more than one minimum path between a pair of
nodes, the messages are assigned to the least
loaded minimum path at the time of assignment.
Experiments have shown that this method of
message routing is only slightly inferior to
the mathematically optimal method, and that, in
fact, the selected routes are generally the
same in both methods. The computation of
minimum path routes, however, is much easier
than that of optimum routes.


4. Message and job arrival rates and sizes
are described by negative exponential distributions.
Empirical measurements on arrival
statistics have tended to substantiate gamma
rather than exponential distributions,² but
the differences have been shown to have small
effect on the calculations, and the exponential
distribution provides a reasonably good model
of typical user requests.

5. Interarrival times are independent of
message lengths and job sizes. It is evident
that this is a poor assumption if we are
describing a single user or processing node,
but Kleinrock has gone to great lengths to demonstrate
that it is a reasonable description when
all users on a sizeable network are considered
simultaneously.³

6. The various nodes behave independently
of one another. This implies, among other
things, that there are effectively no limitations
on the size of message buffers, for, if
a message buffer were to overflow at any node,
further transmission of messages to (and through)
that node would be blocked, thus destroying the
assumption of independent node behavior which
our model demands. We have found that the
assumption of infinite capacity message buffers
is quite valid if the network is operating at
80% or less of its communication capacity. All
networks which the CACTOS study has investigated
possess this characteristic.

7. In the communications network, the
effects of limited node traffic throughput
capacity are negligible compared to the corresponding
link limitations. In effect, we are
assuming that the nodes have an infinite traffic
capacity. Past experience has shown that in
well-designed networks which are not near
saturation and in which time delays have been
minimized, node limitations play a minor role.

8. Node switching delays are constant.
The switching delay is, of course, independent
of the node's message traffic throughput rate
discussed above.

9. All message transmission and computational
processes are error-free, so that
retransmissions and recomputations do not occur.

10.  All  communication  is  via  store-and- 
forward  technology;  there  is  no  circuit-switch¬ 
ing  and  no  dedicated  lines  which  are  unavailable 
to one or more of the nodes of the network.

11.  There  is  no  more  than  one  computer  at 
each  node,  and  no  more  than  one  communication 
link  between  a  pair  of  nodes.  Relaxation  of  this 
assumption  is  planned  for  the  near  future. 

12.  Neither  multi-processing  nor  multi¬ 
programming  is  explicitly  accounted  for  in  the 
model. 
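
As a rough illustration of why the unlimited-buffer assumption (assumption 6) is reasonable below about 80% loading, the sketch below treats a single link as a single-server queue with exponential interarrival and service times. That queueing form is an assumption made here for illustration; it is consistent with assumptions 4 and 5 but is not stated in this form by the source, and the function name is hypothetical. For such a queue, the probability of finding at least n messages present is the utilization raised to the power n.

```python
def prob_at_least(n_messages, utilization):
    """P(N >= n) for a stable single-server queue with exponential
    interarrival and service times: utilization ** n."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization ** n_messages


# Chance that a hypothetical 40-message buffer would be full or overflowing
# at several loading levels.
for rho in (0.5, 0.8, 0.95):
    print(f"utilization {rho:.2f}: P(N >= 40) = {prob_at_least(40, rho):.2e}")
```

The probability falls off geometrically with buffer size at moderate loading but stays appreciable as utilization approaches 1, which matches the direction of the 80% guideline quoted in assumption 6.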


The Routing Algorithm

It is important for us to know the volume
of message traffic over each link in a network.
The frequency of messages on any given link
depends on the communication traffic pattern,
the network topology, and the routing strategy.

The fixed-minimum-path routing strategy
has already been discussed as a model assumption.
Network topology and traffic frequency will be
represented by matrices. Entries in the connectivity
matrix will be defined as $C_{ij} = 1$ if there
is a communication link from node i to node j;
$C_{ij} = 0$ otherwise. The i-jth entry in the
job arrival matrix represents the number of
jobs originating at node i to be processed at
node j, per time period. (This means that a
message will be sent from node i to node j, and
a return message will be sent from j to i.)
Finally, the traffic matrix will be composed of
$\lambda_{ij}$, the frequency of messages across the link from
i to j. It is the job of the routing algorithm
to build the traffic matrix from the other
inputs. Note that if we define the operational
(as opposed to topological) average path length
to be the average number of links traversed by
a message, this quantity is calculable from
$\sum_i \sum_j \lambda_{ij} / \gamma$, where $\gamma$ is the overall system
message input rate.
This figure is returned as an output of the
CACTOS model and has been found to be a significant
system design parameter. Topological
parameters of interest which are also "windfalls"
from the message routing scheme are the radius
and the diameter of the network. A simple
example illustrating these ideas is shown in
Figure A-2.

Figure A-2. An Example of Message Routing
Inputs and Outputs

The algorithm selected for traffic routing
is a modification of Dijkstra's tree-building
algorithm for finding the least-cost paths from
one node to all other nodes in a network. In
this application, the link cost is artificially
set to the number of messages already assigned
to a link, plus $\Omega$, where $\Omega > \sum \gamma$. This mode
of setting the cost forces the cost-minimizing
algorithm to select the shortest path first,
and the least-loaded paths second if there is
more than one shortest path, which is exactly
the scheme desired. The mathematical optimality
of Dijkstra's algorithm and the fact that
$\Omega > \sum \gamma$ guarantee that minimum-link paths will
always be selected; however, the second-order
balancing may be sensitive to the order in which
node pairs are assigned routes. It has been
found empirically that imbalances tend to be
minimized if all node pairs separated by paths
of length one are assigned routes first, all
node pairs separated by paths of length two are
assigned second, and so on up to the diameter
of the network. This scheme is implemented in
the CACTOS model.
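
The routing scheme described in this section (minimum-link paths, with ties broken toward the least-loaded path, and node pairs assigned in order of increasing separation) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the CACTOS code: the function names, the dictionary-based traffic representation, and the particular choice of the large constant `omega` are all hypothetical, and the network is assumed to be connected.

```python
import heapq
from collections import deque


def hop_counts(adj, src):
    """Number of links on a minimum path from src to every other node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def cheapest_path(adj, load, omega, src, dst):
    """Dijkstra's algorithm with link cost = load already assigned + omega."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == dst:
            break
        if cost > best[u]:
            continue  # stale heap entry
        for v in adj[u]:
            new_cost = cost + load[(u, v)] + omega
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return list(zip(path, path[1:]))  # the directed links along the route


def route_traffic(adj, demand):
    """demand maps (source, destination) -> message rate.

    Returns the per-link traffic and the operational average path length.
    """
    load = {(u, v): 0.0 for u in adj for v in adj[u]}
    total = sum(demand.values())
    # Assumption: omega exceeds the load any single path could accumulate,
    # so an extra link always costs more than any difference in loading.
    omega = total * len(adj) + 1.0
    hops = {u: hop_counts(adj, u) for u in adj}
    # Assign pairs one link apart first, then two links apart, and so on.
    for s, d in sorted(demand, key=lambda p: (hops[p[0]][p[1]], p)):
        for link in cheapest_path(adj, load, omega, s, d):
            load[link] += demand[(s, d)]
    return load, sum(load.values()) / total


# Hypothetical four-node ring with duplex links.
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
loads, avg_path_length = route_traffic(square, {(0, 1): 5.0, (0, 2): 1.0})
```

In this example the traffic from node 0 to node 2 has two minimum paths; because the link from 0 to 1 is already loaded, the route through node 3 is chosen. Return messages would be entered as separate demand entries in the opposite direction.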


The Basic Communications Model

The details of a basic model describing the
behavior of a store-and-forward communications
network have been described quite clearly and
will not be re-derived here. Kleinrock's
formula for average message delay is:

$$T = K + \sum_{i=1}^{M} \frac{\lambda_i}{\gamma} \left[ \frac{1}{\mu' C_i} + \frac{\lambda_i / \mu C_i}{\mu C_i - \lambda_i} + \frac{\ell_i}{v} + K \right] \qquad (1)$$

where

$\lambda_i$ = message frequency over link i. (Note
that summation here is over links,
rather than node pairs, as was done in
the previous section.)

$\gamma$ = overall system message input rate.

$1/\mu'$ = content message's average length.

$C_i$ = channel capacity of link i.

$1/\mu$ = average length of messages, including
acknowledgements.

$v$ = propagation rate in the communication
links (usually at, or near, the speed
of light).

$\ell_i$ = length of link i.

$M$ = number of links in the network. (Duplex
lines are treated as two independent
links.)

$K$ = nodal switching delay.

In this expression, $1/\mu' C_i$ is the
transmission time, $(\lambda_i / \mu C_i) / (\mu C_i - \lambda_i)$ is the queueing
delay, and $\ell_i / v$ is the delay for propagation
through the medium of the communication links.
The sum of these terms, plus a nodal switching
delay K, is weighted by $\lambda_i / \gamma$, which has
the effect of multiplying the average delay per
link by the operational average path length,
or, equivalently, weighting each link's average
delay by the amount of traffic which it
carries, and then taking an overall average
delay for the system. Finally, another K is
added in to account for the final switching
delay at the destination node.

One word about the difference between $\mu$
and $\mu'$. As a technique for error control, many
networks require some kind of acknowledgement
message to verify each correct transmission.
The transmission time for a real message
depends only on its own size and the channel
capacity; hence $\mu'$ is used in calculating
transmission delay. Queueing delay, however,
depends on the overall loading of a link,
including acknowledgement traffic. Since the
size of an acknowledgement message is, in
general, different from the size of a "content"
message, a different average message size,
namely $1/\mu$, must be used in the calculation of
queueing delay. The operational implementation
of the CACTOS model allows the user to choose
whether or not the effects of acknowledgement
messages are to be taken into consideration.
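
The delay calculation described above can be sketched in a few lines. This is a minimal illustration of equation (1) as the text describes it term by term, not the CACTOS implementation; the function name, argument names, and the tuple layout for links are assumptions made for this example.

```python
def average_message_delay(links, gamma, mu, mu_content, v, K):
    """Average message delay in the form of equation (1).

    links       iterable of (lam, C, length) triples, one per directed link:
                message frequency, channel capacity, and physical length
    gamma       overall system message input rate
    mu          1 / (average length of all messages, incl. acknowledgements)
    mu_content  1 / (average length of content messages)
    v           propagation rate in the links
    K           nodal switching delay
    """
    total = K  # final switching delay at the destination node
    for lam, C, length in links:
        if lam >= mu * C:
            raise ValueError("link is saturated: lam must be below mu * C")
        transmission = 1.0 / (mu_content * C)      # uses content-message size
        queueing = (lam / (mu * C)) / (mu * C - lam)  # uses all-message size
        propagation = length / v
        # Weight each link's delay by its share of the traffic.
        total += (lam / gamma) * (transmission + queueing + propagation + K)
    return total


# Hypothetical single-link case: 10 messages/s on a 1000 bit/s channel,
# 100-bit content messages, 50-bit overall average, negligible link length.
delay = average_message_delay(
    links=[(10.0, 1000.0, 0.0)], gamma=10.0,
    mu=0.02, mu_content=0.01, v=1.0, K=0.01,
)
```

Here the transmission term is 0.1 s, the queueing term 0.05 s, and two switching delays add 0.02 s. Setting `mu` equal to `mu_content` corresponds to ignoring acknowledgement traffic.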

The interpretation of results from any
analytical model must be made in such a fashion
as to accurately reflect characteristics of
interest in the system being modeled. In the
actual use of the message delay model of equation
(1), several application-oriented questions
arose. These resulted in some modification
of Kleinrock's work to better suit the
purpose of the CACTOS study.

Message  Size  Variability 

Messages on different lines of a real network
will probably be of different average
sizes, and, in fact, the message sizes arising
from different sources may fit different statistical
distributions.

Kleinrock's equations use standard message
sizes $1/\mu$ and $1/\mu'$ throughout the network; the
differences in mean message sizes on different
links may be accounted for by merely subscripting
$\mu$ and $\mu'$. Message sizes are then
computed separately for each link in the network
and are used separately in the individual
calculations of delays on each link. In practice,
the traffic going over each link is a
function of the original source-destination
traffic and message-size matrices and the
routing procedure. If individual average message
sizes are to be calculated for each link,
it is most convenient to save complete information
on message traffic assignments as they are
fixed by the routing procedures. Thus, if the
total number of messages and the total number
of message bits are kept for each link in the
network as they are assigned by the routing
procedure, the mean message sizes, $1/\mu_i$, may be
readily calculated.
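
The bookkeeping described above amounts to keeping two running totals per link. A minimal sketch follows, assuming the routing procedure reports each assigned route as a list of links; the class and method names are hypothetical and not taken from the CACTOS implementation.

```python
from collections import defaultdict


class LinkTally:
    """Accumulates message counts and message bits per link during routing."""

    def __init__(self):
        self.messages = defaultdict(float)  # messages per time period, per link
        self.bits = defaultdict(float)      # bits per time period, per link

    def assign(self, path_links, message_rate, mean_bits):
        """Record one source-destination flow on every link of its route."""
        for link in path_links:
            self.messages[link] += message_rate
            self.bits[link] += message_rate * mean_bits

    def mean_size(self, link):
        """Mean message size on a link: total bits / total messages."""
        return self.bits[link] / self.messages[link]


# Hypothetical flows: short messages from A to C via B, long ones from B to C.
tally = LinkTally()
tally.assign([("A", "B"), ("B", "C")], message_rate=10, mean_bits=100)
tally.assign([("B", "C")], message_rate=30, mean_bits=1000)
```

In this example the link from A to B carries only 100-bit messages, while the link from B to C carries a mix whose mean is 775 bits, so a single network-wide average would misstate both.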

The degree of sensitivity of the model to this change has not been
assessed for any real networks. It will, in all likelihood, be greater in
networks with very diverse message loads over the different links.
Consider, for example, the network shown in Figure 3. One can imagine a
regionalized computation system with this kind of topology, where
regional data input centers send short data messages to computational
centers. These centers, in turn, accumulate data and then send very large
messages to other computation centers for storage or computation. In such
a configuration the difference in message sizes over remote and central
links could be a crucial consideration.




that by using $\lambda_i'$, rather than $\lambda_i$, we will be weighting
delays according to the flow of content messages only on each link,
rather than all messages combined, a distinction not made by Kleinrock.
Since, for our performance model, we are interested only in delays
encountered by content messages, we will make this change. ($\gamma$, of
course, must also reflect only content messages.) Thus, for a link which
carries only acknowledgement messages from node i to node j, the
contribution to the overall response time is zero, a situation which
reflects our interest in the delays encountered by content messages only.


Figure A-3. Regionalized Computation System

Treatment of Acknowledgement Messages

One technique for the control of error and reliability in common practice
is the use of acknowledgement messages. In systems employing only
positive acknowledgement messages, accurate receipt of a message at a
node automatically generates a return message to the transmitting node
indicating that the message was correctly received and retransmission is
unnecessary. Messages are periodically transmitted by the sending nodes
until an acknowledgement is received. In other networks, an error in
message transmission may generate a negative acknowledgement which causes
retransmission of the message.

We  will  assume  a  perfectly  functioning 
positive  acknowledgement  system,  i.e.,  each 
message  generates  an  acknowledgement  along  the 
same  duplex  line  in  the  opposite  direction, 
and  no  retransmissions  are  necessary. 

With subscripted message sizes, equation (1) contains separate terms for
the average queueing delay, $\lambda_i / [\mu_i C_i (\mu_i C_i - \lambda_i)]$,
and transmission time, $1/(\mu_i' C_i)$, of a message on link i. Of
course, message transmission time is independent of consideration of
acknowledgement messages, being a function only of the link's channel
capacity and the size of the transmitted message. Thus transmission delay
should now be $1/(\mu_i' C_i)$.


With  the  changes  for  variable  message  sizes 
and  a  different  treatment  of  acknowledgement 
messages,  the  equation  for  communication  delay 
as  used  in  the  CACTOS  studies  is  given  by 
(2)
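The terms described in this section can be combined in a short Python sketch. The body of equation (2) is not legible in this copy, so the form used here is an assumption assembled from the text: each link contributes a transmission time based on content-message size plus a queueing delay based on total load, weighted by that link's share of content traffic. The `Link` type, its field names, and the function name are hypothetical.

```python
from typing import NamedTuple


class Link(NamedTuple):
    lam_content: float   # content-message arrival rate on the link (messages/s)
    bits_content: float  # mean content-message size (bits), i.e. 1/mu_i'
    lam_total: float     # arrival rate of all messages, acknowledgements included
    bits_total: float    # mean size of all messages (bits), i.e. 1/mu_i
    capacity: float      # channel capacity C_i (bits/s)


def communication_delay(links, gamma):
    """Weighted mean delay over links; gamma is the total content-message rate."""
    total = 0.0
    for ln in links:
        service_rate = ln.capacity / ln.bits_total  # mu_i * C_i, in messages/s
        if ln.lam_total >= service_rate:
            raise ValueError("link is saturated; queueing delay is unbounded")
        transmission = ln.bits_content / ln.capacity  # 1 / (mu_i' C_i)
        queueing = ln.lam_total / (service_rate * (service_rate - ln.lam_total))
        # Weight by content traffic only, so an acknowledgement-only link adds zero.
        total += (ln.lam_content / gamma) * (transmission + queueing)
    return total


if __name__ == "__main__":
    data_link = Link(10.0, 1000.0, 10.0, 1000.0, 50000.0)
    ack_only_link = Link(0.0, 1.0, 10.0, 50.0, 50000.0)
    print(communication_delay([data_link, ack_only_link], gamma=10.0))
```

With no acknowledgement traffic on a link, the two terms sum to the familiar single-server result $1/(\mu_i C_i - \lambda_i)$, which gives a quick sanity check on the sketch.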


The  Computer  Throughput  Model 


Response time in a computer network depends on the processing rate of its
computers as well as the processing rate of its communication facilities.
The computation of the effective processing rate of a computer is an
extremely complex problem, being a function of at least hundreds of
hardware and software parameters. Many approaches to the problem of
estimating computer throughput have been attempted, some of them
involving step-by-step discrete-state simulation and some involving the
construction of analytical models. The requirements of the CACTOS program
dictated that the computer throughput model be fast enough that the
answers are received virtually instantaneously, simple enough that the
user inputs are minimal, and yet detailed enough that computation
parameters might be meaningfully "traded off" with communication
parameters. Speed and simplicity requirements quickly eliminated
discrete-state simulation as a potential technique.


Delay time on queue, however, is a function of the total load on the
system, including acknowledgement messages. Thus, the overall mean
queueing delay on line i is still
$\lambda_i / [\mu_i C_i (\mu_i C_i - \lambda_i)]$, where the unprimed
variables reflect the arrival rates and sizes of all messages,
acknowledgements included.

The weighting factor for communication delays, $\lambda_i'/\gamma$, is
chosen to reflect delays for the messages of interest; the particular
choice depends on the objectives of the analyst when R is used as a
criterion for optimization. Note


The question whether or not a simple and meaningful analytical model of
computer throughput can be constructed is a moot one and depends mainly
on the model's intended applications. For trade-off studies of the scope
and generality of the CACTOS program, the analytical approach taken here
was adequate. Moreover, it is felt that the approach, whereby such
analytical models may be fairly readily constructed, is at least as
important as the results. A small number of relevant hardware and
software parameters was selected for the CACTOS model, but the approach
is of sufficient generality and open-endedness that different parameters,
and more of them, might be similarly included as the analyst requires.

This approach presupposes a reference hardware configuration with known
throughput parameter values against which other configurations are to be
compared. Any standard configuration could be used as a reference; let us
arbitrarily select an IBM 360/50 with model 2314 disc units and 512K
bytes of core. The list of hardware and software parameters which we wish
to include in the model is shown in Table 1; as just pointed out, this
list is arbitrary and could be easily amended to suit a user's particular
needs.

Although there is much discussion as to what the proper units of
throughput should be, we will adopt the fairly artificial unit of
modified bits/millisecond, comparable to the modified bits/second used by
Roberts in his studies of trends in the costs of computer throughput.
Thus, a job's size in this model is described in units of modified bits,
which, when divided by the effective processing rate output by the
computer throughput model, yields the amount of time the job would
consume on the hardware configuration in question.

Before constructing a throughput model, we will need some definitions:


The rationale is that a given job is organized in a particular way, such
that it issues (or can issue) an I/O command at a given point in its
computation sequence, regardless of the hardware configuration upon which
it is run.

TABLE A-1. COMPUTER THROUGHPUT MODEL INPUT PARAMETERS

Hardware
    Instruction rate
    Word size
    Primary memory size
    Peripheral descriptors
        • Average access time
        • Transfer rate
        • Maximum amount of information which may be transferred
          on one access (e.g., cylinder size for a disc)

Software
    Ratio of computation time to total time consumed
    CPU - I/O overlap
    I/O - I/O overlap


Figure 4 also gives us a clue as to how to go about estimating
throughput. Since the processing rate is inversely proportional to the
required time for a given unit of work (we use the modified bit), we need
only add up the times shown in Figure 4c and invert to get a processing
rate. Thus


$TP$ = throughput (effective processing rate for the hardware-software
combination under consideration).

$T_{CPU}$ = computation time.

$T_{IO}$ = input-output time.


$$TP = \frac{1}{(1-n)\,T_{CPU} + n\,T_{CPU} + v\,T_{IO} - n\,T_{CPU}}$$

which simplifies to

$$TP = \frac{1}{(1-n)\,T_{CPU} + v\,T_{IO}} \tag{3a}$$


$f$ = fraction of a job's total time which is spent in computation (as
opposed to I/O) if it is run on the reference hardware.

$n$ = fraction of CPU time overlapped by I/O.

$v$ = time a job's I/O takes when it is overlapped divided by the time
the same I/O takes when performed sequentially.

Figure 4 shows a typical CPU-I/O cycle in various degrees of overlap
which should clarify the preceding definitions. Two things should be
noted here. One is that the range of v is from 1/c to 1, where c is the
number of I/O channels, since with full utilization of all channels the
I/O time could not be less than $T_{IO}/c$. Also note that n, the degree
of I/O-CPU overlap, is associated with the job alone and is independent
of the hardware under consideration.


Equation (3a) is valid when I/O operations are not completely overlapped
by computation, a condition expressed algebraically by
$n T_{CPU} < v T_{IO}$.

When $n T_{CPU} \ge v T_{IO}$, I/O is completely overlapped by
computation, a condition illustrated in Figure 5. In this case, the CPU
rate is the sole factor determining throughput and we must use the
equation

$$TP = \frac{1}{T_{CPU}} \tag{3b}$$

Equations (3a) and (3b) constitute a throughput model once we have a way
to compute $T_{CPU}$ and $T_{IO}$ for the machine and workload under
consideration.
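The two-case throughput rule can be written as a small Python function. This is a minimal sketch, not the CACTOS code: the function name and argument names are hypothetical, and the times are taken per unit of work (per modified bit) as in the text. The example values reproduce the three panels of Figure 4 with CPU time/total time = 1/3.

```python
def throughput(t_cpu, t_io, n, v):
    """Effective processing rate TP for one unit of work.

    t_cpu: computation time, t_io: sequential I/O time,
    n: fraction of CPU time overlapped by I/O (0..1),
    v: overlapped I/O time divided by sequential I/O time (1/c..1).
    """
    if not 0.0 <= n <= 1.0:
        raise ValueError("n must lie between 0 and 1")
    if not 0.0 < v <= 1.0:
        raise ValueError("v must lie in (0, 1]")
    if n * t_cpu < v * t_io:
        # I/O only partly hidden behind computation: add the exposed times.
        elapsed = (1.0 - n) * t_cpu + v * t_io
    else:
        # I/O completely overlapped: CPU time alone sets the elapsed time.
        elapsed = t_cpu
    return 1.0 / elapsed


if __name__ == "__main__":
    print(throughput(1.0, 2.0, n=0.0, v=1.0))  # no overlap
    print(throughput(1.0, 2.0, n=0.0, v=0.5))  # 2-channel I/O overlap only
    print(throughput(1.0, 2.0, n=0.5, v=0.5))  # I/O and CPU overlap
```

The two branches agree at the boundary $n T_{CPU} = v T_{IO}$, so the modeled throughput is continuous as overlap increases.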


First, if we assume that a computer's CPU is capable of processing p
instructions/ms and that an instruction is capable of modifying w bits
(w is generally the computer's word size), then the time per modified bit
is $1/(wp)$. For a given job with a fraction f of its total time spent in
computation, we have


(a) No overlap (v = 1, n = 0)

(b) 2-channel overlapping I/O operations. No I/O-CPU overlap (n = 0)

(c) 2-channel overlapping I/O operations. I/O-CPU overlap


$\left(5.12 \times 10^5 / m\right)^{\alpha}$ accesses on a machine with m
bytes of core memory. From a large amount of timing data for a real
program on many hardware configurations, $\alpha$ has been estimated to
have the value 0.83. The determination of $\alpha$ will be fully
documented as part of a forthcoming document on validation of the CACTOS
model.


Figure A-5. A Compute-I/O Cycle in which I/O is Completely Overlapped by
Computation


Figure A-4. A Typical Compute-I/O Cycle in Varying Degrees of Overlap
(CPU time/total time = f = 1/3)

The calculation of $T_{IO}$ is somewhat more difficult. First, we need to
convert the units of I/O work into the equivalent amount of work in
modified bits. Also, it is non-trivial to assess the effects of the size
of primary memory on I/O time.

We begin by asserting that a job's I/O time is proportional to the number
of I/O accesses, the average duration of each access, and the proportion
of the job's time spent in I/O operations, i.e.,

$T_{IO} = k$ (number of accesses) (average access duration) $(1-f)$

The number of accesses required is a function of the primary memory size:
the greater the memory size of the computer under consideration, the
smaller the number of required accesses. We will assume that the number
of accesses is inversely proportional to some power $\alpha$ of the
memory size. Then, since the reference hardware has 5.12*10^5 bytes of
core memory, one access on the reference hardware would correspond to


The average duration of an I/O operation is given by $a + r/x$, where
a = average access time, r = average record size, and x = transfer rate
of the storage device. But we know that a larger primary memory would
permit the construction of larger records for secondary memory, a
strategy which permits a gain in I/O efficiency. If we assume that
primary memory size and secondary memory record size are, in fact,
proportional, then the duration of an I/O operation on a machine with m
bytes of core memory would be $a + \frac{m r}{(5.12 \times 10^5)\,x}$. A
final point here is that little is gained if we are operating on a device
which can handle only a limited amount of information without making
another access. On a disc, it is not particularly beneficial to increase
the record size beyond a cylinder's capacity. The implementation of the
CACTOS model is cognizant of this and does not adjust the record size
beyond that of the cylinder capacity or comparable quantity on the I/O
device under consideration.


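If we now let C = cylinder size, we can write the expression for I/O
time:

$$T_{IO} = k \left(\frac{5.12 \times 10^5}{m}\right)^{\alpha}
\left(a + \frac{m r}{(5.12 \times 10^5)\,x}\right)(1-f),
\qquad m \le \frac{5.12 \times 10^5\, C}{r}$$

and

$$T_{IO} = k \left(\frac{5.12 \times 10^5}{m}\right)^{\alpha}
\left(a + \frac{C}{x}\right)(1-f),
\qquad m > \frac{5.12 \times 10^5\, C}{r}$$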
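The I/O time expression can be sketched in Python as follows. The function and argument names are hypothetical, the numeric inputs in the example are illustrative rather than manufacturer figures, and the proportionality constant k is passed in rather than derived here.

```python
REF_CORE_BYTES = 5.12e5  # core size of the reference configuration in the text


def io_time(k, m, a, r, x, cylinder, f, alpha=0.83):
    """I/O time per unit of work on a machine with m bytes of core.

    k: proportionality constant, a: average access time,
    r: record size on the reference hardware, x: transfer rate,
    cylinder: largest amount transferable on one access,
    f: fraction of the job's time spent computing on the reference hardware.
    """
    # Fewer accesses are needed as core grows (inverse power law in m).
    accesses = (REF_CORE_BYTES / m) ** alpha
    # Record size scales with core size but never beyond one cylinder.
    record = min(m * r / REF_CORE_BYTES, cylinder)
    duration = a + record / x
    return k * accesses * duration * (1.0 - f)


if __name__ == "__main__":
    # Illustrative numbers only.
    for core in (5.12e5, 1.024e6, 1.024e7):
        print(core, io_time(k=1.0, m=core, a=10.0, r=100.0, x=50.0,
                            cylinder=1000.0, f=0.5))
```

Taking the minimum of the scaled record size and the cylinder capacity covers both branches of the expression in one line.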


We need only evaluate k to complete our computer throughput model. To do
this, consider a job which is half computation and half I/O running on
the reference hardware, with record size equal to the 2314 track size. We
know that for this job and hardware configuration, $T_{IO} = T_{CPU}$, or

$$\frac{f}{wp} = k \left(\frac{5.12 \times 10^5}{m}\right)^{\alpha}
\left(a + \frac{m r}{(5.12 \times 10^5)\,x}\right)(1-f)$$

Upon substituting f = 0.5, m = 5.12*10^5, and the manufacturer's
published figures for w, p (we use the inverse of the add time but it may
be more desirable to use the instruction rate for a typical instruction
mix), a, r, and x, we may easily solve the equation to get

k = 1.65*10^6.


Integration  of  the  Parts 

To  complete  the  whole  computer  network 
model,  we  need  to  do  three  more  things: 
compute  message  delay  from  the  packet  delay 
given  by  equation  (2),  compute  computational 
delay  using  the  output  of  the  computer  through¬ 
put  model,  and  sum  the  average  communication 
and  computation  delays. 

Large messages are not generally transmitted through a network in one
piece but are divided into smaller packets which may be more readily
handled. The packets are sent separately through the network and
reassembled at the destination node. It is not adequate to treat this
procedure in the model by dividing the average message sizes by the
number of packets/message and multiplying the arrival rate by the same
number. Instead, one must consider the actual distribution of message
sizes.

If the cumulative distribution function describing message sizes is
$Q(X) = \Pr(1/\mu_i < X)$, then the fraction of messages of size less
than X is simply Q(X). If Z is the maximum number of bits in a packet,
then, by allowing X to assume the values Z, 2Z, 3Z, ..., corresponding to
1, 2, 3, ... packets, we may easily compute the number of messages
requiring X/Z packets for transmission, which is the total number of
messages multiplied by

$$Q(X) - Q(X - Z), \qquad Q(0) = 0.$$

The total number of packets over each link i in the network may then be
readily calculated and replaces $\lambda_i'$ in equation (2).
($\lambda_i$ must also be adjusted to reflect packet traffic.) Dividing
the total number of bits transmitted by the number of packets required
gives the new average content message size, $1/\mu_i'$; again, $\mu_i$
must be appropriately adjusted in the straightforward way.

Table 2 shows how the number of packets and average packet size are
calculated for a sample of 1000 messages with an average message size of
100 bits and a maximum packet size of 100 bits. In this case, an expected
1578 packets, of average size 63.4 bits, would be required.

TABLE A-2. CALCULATION OF THE NUMBER AND AVERAGE SIZE OF PACKETS WITH
AVERAGE MESSAGE SIZE = 100 BITS AND MAXIMUM PACKET SIZE = 100 BITS

No. of Packets   Message Size              No. of      No. of
Per Message      Range (Bits)    Q(X)      Messages    Packets
--------------   ------------    -----     --------    -------
      1            1-100         .6321        632         632
      2          101-200         .8647        233         466
      3          201-300         .9502         86         258
      4          301-400         .9817         31         124
      5          401-500         .9933         12          60
      6          501-600         .9975          4          24
      7          601-700         .9991          2          14
--------------   ------------    -----     --------    -------
Total                                        1000        1578

Average Packet Size = 100,000 / 1578 = 63.4 bits
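The packet calculation can be sketched in Python under the assumption that message sizes are exponentially distributed, i.e. $Q(X) = 1 - e^{-X/100}$ for a 100-bit mean, which matches the Q(X) column of the table. The function name and return layout are hypothetical. The sketch keeps fractional message counts and sums over many more than seven packets per message, so its totals come out slightly above the rounded, truncated table (about 1582 packets of about 63.2 bits instead of 1578 and 63.4).

```python
import math


def packetize(num_messages, mean_bits, max_packet_bits, max_packets=200):
    """Expected packet counts for exponentially distributed message sizes.

    Returns (rows, total_packets, mean_packet_bits), where each row is
    (packets_per_message, expected_messages, expected_packets).
    """
    def q(x):
        # Assumed exponential CDF of message size; q(0) == 0.
        return 1.0 - math.exp(-x / mean_bits)

    rows = []
    total_packets = 0.0
    for k in range(1, max_packets + 1):
        # Fraction of messages whose size falls in ((k-1)Z, kZ].
        fraction = q(k * max_packet_bits) - q((k - 1) * max_packet_bits)
        messages = num_messages * fraction
        rows.append((k, messages, k * messages))
        total_packets += k * messages
    total_bits = num_messages * mean_bits
    return rows, total_packets, total_bits / total_packets


if __name__ == "__main__":
    rows, packets, size = packetize(1000, 100.0, 100.0)
    for k, messages, pk in rows[:7]:
        print(k, round(messages), round(pk))
    print(round(packets, 1), round(size, 1))
```

Rounding the first seven rows to whole messages gives the message counts listed in the table.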

Two  questions  about  statistical  validity 
arise  as  the  result  of  abandoning  the  message 
as  the  individual  atom  being  transmitted 
through  the  network  and  treating  messages  as 
groups  of  smaller  amounts  of  information, 
called  packets. 

Important to the calculations is the assumption that message arrivals are
independent of message lengths, an assumption discussed at length by
Kleinrock. When considering long, undivided messages arriving at nodes,
it is clear that this assumption becomes less valid, since the minimum
interarrival time between long messages must be affected by the long
transmission times associated with them.*

Thus, one might expect an improvement in statistical validity by treating
long messages as groups of shorter ones.


On the other hand, the introduction of the concept of packets in the
manner described creates other perturbations which may affect the
calculations even more. The derivation of equation (2) assumes that the
sizes of the units of information being transmitted through the network
are taken from an exponential distribution. This is not likely to be the
case when packets are used, because packet sizes will never exceed the
maximum allowable packet size. As Kleinrock points out, however, there is
an easy treatment of this dilemma by resorting to the Pollaczek-Khinchin
formula for channel delay with any message-length distribution of known
mean $1/\mu_i$ and variance $\sigma_i^2$:


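$$\text{delay on channel } i = \frac{1}{\mu_i C_i}
\left[\frac{1 - \dfrac{\lambda_i}{2\mu_i C_i}\left(1 - \mu_i^2 \sigma_i^2\right)}
{1 - \dfrac{\lambda_i}{\mu_i C_i}}\right] \tag{4}$$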
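The Pollaczek-Khinchin delay can be checked numerically with a few lines of Python. The sketch below uses the standard M/G/1 form of the formula; the function and argument names are hypothetical. It illustrates two points: the formula collapses to $1/(\mu_i C_i - \lambda_i)$ when the variance equals the squared mean (the exponential case), and a bounded packet-size distribution with smaller variance yields a smaller delay at the same load.

```python
def pk_channel_delay(lam, mean_bits, var_bits, capacity):
    """Mean delay (queueing plus transmission) on one channel, M/G/1.

    lam: arrival rate (messages/s), mean_bits: mean message length 1/mu,
    var_bits: variance of message length (bits squared), capacity: bits/s.
    """
    service_rate = capacity / mean_bits          # mu * C, in messages/s
    rho = lam / service_rate                     # channel utilization
    if rho >= 1.0:
        raise ValueError("channel is saturated")
    cv2 = var_bits / (mean_bits * mean_bits)     # mu^2 * sigma^2
    return (1.0 / service_rate) * (1.0 - 0.5 * rho * (1.0 - cv2)) / (1.0 - rho)


if __name__ == "__main__":
    exponential = pk_channel_delay(10.0, 1000.0, 1000.0 ** 2, 50000.0)
    fixed_size = pk_channel_delay(10.0, 1000.0, 0.0, 50000.0)
    print(exponential, fixed_size)
```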


Although we have not done it for either system under study, equation (4)
could easily be incorporated into the response time equations, and
$1/\mu_i$ and $\sigma_i^2$ could then be determined from analysis of the
system's packet traffic. While this would end the assumption of all
random processes within the system being governed by negative exponential
distributions, it might be a better approximation to the true performance
of a packet system than is represented by equation (2).


The second question of validity concerns the distribution of the arrival
time of messages and jobs at the computation nodes. If one considers the
arrival of messages at destination nodes where a message consists of some
number of packets which make their way through the system, then both
theoretical considerations and measurements on analogous systems suggest
that a gamma distribution best describes message arrivals at terminal
nodes.


* But, as Kleinrock points out, this effect is minimized when a large
system (many source nodes) is considered because arrivals at one node are
independent of message lengths at other nodes, and thus the overall
arrival rate into a large system tends to approximate independence of all
message lengths, a conclusion well substantiated by simulation results.


To explore this possibility a bit further, assume now that packet arrival
in the system is a Poisson process with mean arrival rate $\lambda_p$.
(We know that at a given source node, this would be a terrible assumption
because packets arrive in groups which constitute a message, followed by
a pause until the next message has been constituted and received. But
again, if we consider a large system, we can make an assumption similar
to Kleinrock's and say that interarrival times for the system as a whole
are independent of both transmission times and blocking considerations
and thus constitute a Poisson process.)

Packet interarrival times, then, are governed by an exponential
distribution whose cumulative distribution function is

$$G(t) = 1 - e^{-\lambda_p t}$$

If there is an average of n packets per message, then the gamma
distribution describing message arrivals is

Further evidence of the credibility of the gamma distribution here is
obtained from consideration of the special case where messages are not
divided into packets; i.e., n = 1. This leads to

$$G(t) = 1 - e^{-\lambda_p t}$$

which is the exponential distribution used for the arrival of one-packet
messages by Kleinrock in the derivation of the original model.
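The reduction to the exponential case can be illustrated in Python. The sketch assumes that n is an integer and that every message consists of exactly n packets with independent, exponentially distributed interarrival times, so that message interarrival times follow the integer-shape gamma (Erlang) distribution; this is a simplification of the "average of n packets" wording, and the function name is hypothetical.

```python
import math


def erlang_cdf(t, n, rate):
    """CDF of the time needed for n Poisson packet arrivals at the given rate."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if t <= 0.0:
        return 0.0
    x = rate * t
    term = 1.0       # (rate*t)^j / j! for j = 0
    partial = 1.0    # running sum over j = 0 .. n-1
    for j in range(1, n):
        term *= x / j
        partial += term
    return 1.0 - math.exp(-x) * partial


if __name__ == "__main__":
    # n = 1 is the plain exponential distribution.
    print(erlang_cdf(0.5, 1, 2.0), 1.0 - math.exp(-1.0))
    # Larger n concentrates interarrival times around n / rate.
    print(erlang_cdf(0.5, 2, 2.0))
```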


The queueing analysis of computation jobs at network nodes depends on the
assumption that message arrivals at these nodes constitute a Poisson
process. If a gamma distribution, rather than an exponential
distribution, describes these arrivals, the model is obviously inaccurate
when messages are split into packets. Unfortunately, a mathematical
analysis of queueing and response times based on gamma statistics does
not appear tractable, so the ramifications of this development can
probably be deduced only from a simulation. Fuchs and Jackson, who found
by measurement that arrival statistics in four time-sharing systems were
gamma distributed, addressed themselves to errors induced by substituting
exponential distributions. Differences between the cumulative
distributions were plotted as a function of the coefficient of variation
of the gamma distribution, but these say little about how they relate to
errors in the final outputs of the model. We will continue to treat
packet arrivals as a Poisson process and assume that the errors thus
induced are small.

After making the necessary changes to compute message delay by computing
the number of packets required, which we will call b, and multiplying the
message arrival rate by b, $T_{COMM}$ (communication delay) may be
properly computed from equation (2).

The computational delay at a node i may be computed from the simple
single-server Markovian arrival queueing formula

$$T_{COMP_i} = \frac{1}{\mu_i^J \, TP_i - \lambda_i^J}$$

where $1/\mu_i^J$ = mean job size at node i

$TP_i$ = throughput rate as calculated from equation (3)

$\lambda_i^J$ = mean job arrival rate at node i

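The overall average computational delay may be computed from the weighted
average

$$T_{COMP} = \sum_{i=1}^{N} \frac{\lambda_i^J}{\lambda^J} \, T_{COMP_i}$$

where N = number of nodes in the network and

$$\lambda^J = \sum_{i=1}^{N} \lambda_i^J = \text{total job input rate for the network}$$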

Finally, if we define X to be the number of remote jobs divided by the
total number of jobs (remote jobs appear in positions other than the
diagonal of the job arrival matrix, and are the only ones which require
internode communication) and remember that two messages are associated
with each remote job, the overall average response time for the system
may be computed from

SUMMARY 

In this paper, the need for quantitative modeling of computer networks
has been discussed, and one approach to the construction of an analytical
model of computer network performance has been outlined. The validation
of the model and some results obtained by using it in cost-effectiveness
trade-off studies are to be topics of future publications.

The definitions of parameters in this paper have been given more from the
point of view of the research scientist than from that of the system
designer. The research scientist is interested in having the parametric
description of a given job remain invariant over all hardware
configurations; therefore, such job parameters as the ratio of CPU time
to total time and the degree of I/O - I/O overlap always refer to the
performance of the job on the reference hardware. To the system designer,
it may be inconvenient to have to describe a real job in terms of its
behavior on a configuration on which it has never been run. It is
possible, however, to develop formulas for the translation of parameters
measured on a known system to their corresponding values on the reference
configuration, and it is not difficult to follow the steps outlined in
this paper and redevelop the computer throughput model using some other
reference configuration which is more convenient to the user. Thus, in a
broad sense, the results presented here should be of interest both to the
generalist and the specialist. It is anticipated that the model will be a
useful tool in the evaluation of proposed changes to existing networks,
as an aid in the design of new networks, and in understanding the
behavior of computer networks in more general ways.
APPENDIX B. VALIDATION OF THE CACTOS MODEL

Three major parts of the CACTOS model were addressed in the validation of
the model equations. These were:

• Communication analysis equations
• Hardware computation analysis equations
• Software computation analysis equations

The communication analysis methodology was based on the work of Kleinrock [1]
and others; Kleinrock discusses validation of the communication analysis. In
addition, the Project conducted further validation experiments using a discrete
simulation model based on ECSS, a computer simulation language in SIMSCRIPT
developed at the RAND Corporation.

The  hardware  computation  validation  consisted  of  examining  the  equation  for 
computer  throughput  based  on  various  hardware  parameters.  This  throughput 
equation  was  parameterized  in  that  the  exponent  of  core  memory  was  undefined. 

The  reason  for  this  was  that  the  contribution  of  the  other  hardware  features 
was  better  defined.  To  perform  the  validation,  a  set  of  processing  runs  from 
a  variety  of  configurations  for  the  same  programs  was  needed.  One  program 
that  exactly  satisfies  this  criterion  is  the  IBM  Sort/Merge  program.  Twenty 
configurations  were  used;  they  are  shown  in  Table  B-l.  Calibration  of  the 
parameter  associated  with  core  storage  was  performed  on  the  fifth  configuration. 

The range of dispersion in percentage varied from 2%-45%. In three cases, the
dispersion exceeded 23%. For these cases, the situation was small core size
with 2311 and 3330 disk units. The fit improved as core increased. This was
most important, since the experiments involved larger core sizes than those in
Table B-1.
TABLE B-1. COMPUTER CONFIGURATION FOR VALIDATION RUNS

Configuration   Computer   Core   Disk
-------------   --------   ----   ----
      1         360/30      38K   2311
      2         360/50      44K   2311
      3         360/50      44K   2314
      4         360/50     100K   2311
      5         360/50     100K   2314
      6         360/50     200K   2311
      7         360/50     200K   2314
      8         360/65     100K   2311
      9         360/65     100K   2314
     10         360/65     200K   2311
     11         360/65     200K   2314
     12         370/155     44K   2311
     13         370/155     44K   2314
     14         370/155     44K   3330
     15         370/155    100K   2311
     16         370/155    100K   2314
     17         370/155    100K   3330
     18         370/155    200K   2311
     19         370/155    200K   2314
     20         370/155    200K   3330
It can be noted that this represents a shortcoming of an analytic approach
versus a discrete one, in that accuracy across wider ranges of parameters is
possible in the discrete case. Had the experiments been oriented toward a
close fit at every core size, several equations could have been employed.

The third part of the validation, covering the software aspects of
computation, includes record size and the relationship between CPU and I/O in
terms of overlap and balance. To validate this, the JOVIAL program shown in
Figure B-1 was constructed. The purpose of the program is to carry out a
specified number of CPU and I/O operations while timing itself internally.
(JOVIAL permits such timing.) For a variety of I/O and CPU balances and
overlaps, the results of the model and program were compared. The results are
summarized in Figure B-2. In this figure the horizontal axis is the experiment
number while the vertical axis is the time in seconds. The points labeled with
an X are those of the program. The program was run on a 370/155. The
dispersion for all but one case is less than 20%. Since the direction of the
times and incremental rise for both the program and the model was the same,
this was felt to be adequate.
Figure B-2. Comparison of CACTOS Model and Validation Program for CPU Overlap and Balance


